Multivariate n-term rational and
piecewise polynomial approximation *

Pencho  Petrushev 

Department  of  Mathematics,  University  of  South  Carolina, 

Columbia,  SC  29208 


Abstract 

We study nonlinear approximation in $L_p(\mathbb{R}^d)$ ($0 < p < \infty$, $d > 1$) from (a) n-term
rational functions, and (b) piecewise polynomials generated by different anisotropic
dyadic partitions of $\mathbb{R}^d$. To characterize the rates of each such piecewise polynomial
approximation we introduce a family of smoothness spaces (B-spaces) which can be
viewed as an anisotropic variation of Besov spaces. We use the B-spaces to prove Jackson
and Bernstein estimates and then characterize the piecewise polynomial approximation
by interpolation. Our main estimate relates n-term rational approximation
with piecewise polynomial approximation in $L_p(\mathbb{R}^d)$. This result enables us to obtain a
direct estimate for n-term rational approximation in terms of a minimal B-norm (over
all dyadic partitions). We also show that the Haar bases associated with anisotropic
dyadic partitions of $\mathbb{R}^d$ can be successfully utilized for nonlinear approximation. We
give an effective algorithm for best Haar basis or best B-space selection.


1  Introduction 


The theory of univariate rational approximation on $\mathbb{R}$ is a relatively well developed area in
approximation theory (see, e.g., [20]). At the same time, the theory of multivariate rational
approximation is virtually nonexistent. A reason for this is that it is extremely hard to
deal with rational functions of the form $R := P/Q$, where $P$ and $Q$ are algebraic polynomials
in $d$ variables ($d > 1$). Very little is known about this type of rational function. It seems
natural to consider approximation from the smaller set of n-term rational functions or atomic
rational functions, that is, the set of all rational functions of the form


$$R = \sum_{j=1}^{n} r_j$$

with $r_j$ of the form

$$r(x) = \prod_{k=1}^{d} \frac{a_k x_k + b_k}{(x_k - \alpha_k)^2 + \beta_k^2}. \qquad (1.1)$$


As will be shown in this article, approximation from n-term rational functions is a powerful tool
and, at the same time, it is more tangible than approximation from general rational functions $P/Q$.
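
To make the form (1.1) concrete, here is a minimal Python sketch that evaluates one atomic rational function and an n-term sum at a point. The function names `atom` and `n_term_rational` and the tuple layout `(a, b, alpha, beta)` are illustrative choices, not notation from this article.

```python
from math import prod

def atom(x, a, b, alpha, beta):
    """Evaluate one atomic rational function of the form (1.1) at the point x.

    x, a, b, alpha, beta are sequences of length d (hypothetical layout).
    """
    return prod(
        (a_k * x_k + b_k) / ((x_k - alpha_k) ** 2 + beta_k ** 2)
        for x_k, a_k, b_k, alpha_k, beta_k in zip(x, a, b, alpha, beta)
    )

def n_term_rational(x, atoms):
    """Evaluate R = r_1 + ... + r_n, each atom given as (a, b, alpha, beta)."""
    return sum(atom(x, *params) for params in atoms)

# Example in d = 2: r(x) = 1 / ((x_1^2 + 1) (x_2^2 + 1)).
bump = ((0.0, 0.0), (1.0, 1.0), (0.0, 0.0), (1.0, 1.0))
single = atom((1.0, 0.0), *bump)
value = n_term_rational((1.0, 0.0), [bump, bump])
```

Each atom is a product of univariate factors, which is what makes this class far simpler to evaluate and manipulate than a general quotient of multivariate polynomials.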

*This  research  was  supported  by  ONR/ARO-DEPSCoR  Research  Contract  DAAG55-98-1-0002. 
It is also interesting to consider approximation from multivariate rational functions of the
form $R = \sum_{j=1}^{n} r_j$, where $r_j$ are dilates and shifts of a single radial partial fraction such as
$r(x) = 1/(1 + |x|^2)^k$. In [12], we consider such approximation and prove a direct estimate in
terms of the usual Besov norm (exactly the same as the one used in nonlinear approximation
from wavelets or regular splines). To prove this result, we first constructed good bases
consisting of dyadic shifts and dilates of a single rational function and then utilized them for
nonlinear approximation.

In this article, we take a different approach to the problem. We prove an estimate
that relates the multivariate n-term rational approximation to a broad class of nonlinear
piecewise polynomial approximation in $L_p(\mathbb{R}^d)$ ($0 < p < \infty$). In particular, this result relates
the n-term rational approximation to nonlinear approximation from piecewise polynomials
generated by any anisotropic dyadic partition of $\mathbb{R}^d$. Then we utilize this relationship to
obtain an estimate for n-term rational approximation in terms of the minimal smoothness
norm (over all dyadic partitions). These estimates extend to the multivariate case results
from [15], [18].

As a consequence of our approach, a substantial part of this article is devoted to nonlinear
approximation from piecewise polynomials over dyadic partitions, which is interesting in
its own right. To the best of our knowledge this problem was first posed explicitly in
§5.4.3 of [14]. Note that we consider not one but a collection of approximation processes,
each of them determined by a dyadic partition of $\mathbb{R}^d$. The ultimate goal of the theory of
any approximation scheme is to characterize the rates of approximation in terms of certain
smoothness conditions. To characterize the rates of piecewise polynomial approximation
generated by an arbitrary dyadic partition, we introduce a family of new smoothness spaces
(B-spaces) which can be viewed as an anisotropic variation of Besov spaces. We use the
B-spaces to prove Jackson and Bernstein estimates and then characterize the approximation by
interpolation. In [17], we proved that in the univariate case a scale of Besov spaces governs the
rates of nonlinear piecewise polynomial approximation. Similar Besov spaces have also
been used for characterization of multivariate nonlinear (regular) spline $L_p$-approximation
in [5] ($1 < p < \infty$) and [7] ($p = \infty$), see also [5]. Here we extend and refine these results.

In addition to this, we consider the library of anisotropic Haar bases which are naturally
associated with anisotropic dyadic partitions of $\mathbb{R}^d$. Since every anisotropic Haar basis is an
unconditional basis in $L_p$ ($1 < p < \infty$) and characterizes the corresponding B-spaces (see §5
below), it provides an effective tool for nonlinear approximation from piecewise constants.
Moreover, as we show in §5, in a natural discrete setting, there is a practically feasible
algorithm for best Haar basis or best B-space selection for any given function. In this way,
the approximation procedure can effectively be completed.

A leading idea in this article is that the classical smoothness spaces are not suitable
for measuring the smoothness of functions in highly nonlinear approximation such as
multivariate rational or piecewise polynomial approximation. More sophisticated means of
measuring the smoothness are needed. We believe that, in some cases, the smoothness should
be measured by means of a collection of smoothness space scales (like the B-spaces).

The outline of the article is the following. In §2, we introduce the B-spaces and establish
some of their basic properties. In §3, we prove Jackson and Bernstein estimates and then
characterize the nonlinear piecewise polynomial approximation generated by an arbitrary
anisotropic dyadic partition of $\mathbb{R}^d$. In §4, we prove an estimate that relates the n-term rational
approximation to nonlinear piecewise polynomial approximation and, as a consequence,
we obtain a direct estimate for rational approximation in terms of the minimal B-norm.
Section 5 is devoted to the anisotropic Haar bases. We give an algorithm for best Haar
basis or best B-space selection. In §6, we present our viewpoint on some of the principal
questions concerning nonlinear approximation and pose some open problems. Section 7 is
an appendix, where we give the proofs of some auxiliary statements from §2 and the lengthy
proof of an interpolation result from §3.

Throughout this article, positive constants are denoted by $c, c_1, \dots$ and they may
vary at every occurrence; $A \approx B$ means $c_1 B \le A \le c_2 B$; $\Pi_k$ denotes the set of all algebraic
polynomials in $d$ variables of total degree $< k$. For a set $E \subset \mathbb{R}^d$, $\mathbb{1}_E$ denotes the characteristic
function of $E$, and $|E|$ denotes the Lebesgue measure of $E$.

2  B-spaces 

In this section, we introduce a family of smoothness spaces (B-spaces) which will be used
for characterization of nonlinear piecewise polynomial approximation (§3, §5) and in n-term
rational approximation (§4). These spaces can be defined on $\mathbb{R}^d$ ($d > 1$) or on an arbitrary
box $\Omega$ in $\mathbb{R}^d$. For convenience, we shall only consider the case when $|\Omega| = 1$ and $\Omega$ has
sides parallel to the coordinate axes. We shall define the B-spaces by using local polynomial
approximation over boxes from nested anisotropic dyadic partitions of $\mathbb{R}^d$ or $\Omega$.

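• Anisotropic dyadic partitions of $\mathbb{R}^d$ or $\Omega$. We call

$$\mathcal{P} = \bigcup_{m \in \mathbb{Z}} \mathcal{P}_m$$

a dyadic partition of $\mathbb{R}^d$ with levels $\{\mathcal{P}_m\}$ if the following conditions are fulfilled:

(a) Every level $\mathcal{P}_m$ is a partition of $\mathbb{R}^d$: $\mathbb{R}^d = \bigcup_{I \in \mathcal{P}_m} I$ and $\mathcal{P}_m$ consists of disjoint dyadic
boxes of the form $I = I_1 \times \dots \times I_d$, where each $I_j$ is a semi-open dyadic interval
($I_j = [(\nu - 1)2^{\mu}, \nu 2^{\mu})$), and $|I| = 2^{-m}$.

(b) The levels of $\mathcal{P}$ are nested, i.e., $\mathcal{P}_{m+1}$ is a refinement of $\mathcal{P}_m$. Thus each $I \in \mathcal{P}_m$ has
two children, say, $J_1, J_2 \in \mathcal{P}_{m+1}$ such that $I = J_1 \cup J_2$ and $J_1 \cap J_2 = \emptyset$.

(c) For any boxes $I', I'' \in \mathcal{P}$ there exists a box $I \in \mathcal{P}$ such that $I' \cup I'' \subset I$.

Also, we call $\mathcal{P} = \bigcup_{m \ge 0} \mathcal{P}_m$ a dyadic partition of $\Omega$ ($|\Omega| = 1$) if $\mathcal{P}_0 := \{\Omega\}$ and the levels
$\{\mathcal{P}_m\}_{m \ge 1}$ satisfy conditions (a)-(b) from above with $\mathbb{R}^d$ replaced by $\Omega$.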

The next few remarks will help to understand better the nature of dyadic partitions. First,
condition (c) above is not very restrictive but it prevents $\mathcal{P}_m$ from possible deteriorations as
$m \to -\infty$. This condition implies that in each dyadic partition $\mathcal{P}$ of $\mathbb{R}^d$ there is a single tree
structure with respect to the inclusion relation.

We note that the two children, say, $J_1, J_2 \in \mathcal{P}_{m+1}$ of any $I \in \mathcal{P}_m$ can be obtained by
splitting $I$ into two equal subboxes in $d$ ($d > 1$) different ways. Therefore, there is a huge
variety of anisotropic dyadic partitions $\mathcal{P}$ of $\mathbb{R}^d$ or $\Omega$.

A dyadic partition of any box can easily be obtained inductively (by successive subdividing).
For instance, suppose we want to subdivide $\Omega$. Assume that the levels $\{\mathcal{P}_j\}_{0 \le j \le m}$
have already been defined. We now subdivide each box $I \in \mathcal{P}_m$ by "halving" $I$ in one of the
$d$ coordinate directions, thus obtaining two new dyadic boxes which we include in $\mathcal{P}_{m+1}$.
We process in the same way all boxes from $\mathcal{P}_m$ and as a result obtain the next level $\mathcal{P}_{m+1}$ of
dyadic boxes.
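
The inductive construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the box is taken to be $[0,1)^d$, boxes are stored as tuples of `(lo, hi)` pairs, and the names `halve`, `build_levels`, and the callback `choose_direction` (which picks the splitting direction for each box) are hypothetical.

```python
from fractions import Fraction
import random

def halve(box, direction):
    """Split a box (tuple of (lo, hi) intervals) into two equal children."""
    lo, hi = box[direction]
    mid = (lo + hi) / 2
    left = box[:direction] + ((lo, mid),) + box[direction + 1:]
    right = box[:direction] + ((mid, hi),) + box[direction + 1:]
    return left, right

def build_levels(d, depth, choose_direction):
    """Return the levels P_0, ..., P_depth of a dyadic partition of [0,1)^d."""
    root = tuple((Fraction(0), Fraction(1)) for _ in range(d))
    levels = [[root]]
    for _ in range(depth):
        next_level = []
        for box in levels[-1]:
            # Each box may be halved in a different coordinate direction.
            next_level.extend(halve(box, choose_direction(box)))
        levels.append(next_level)
    return levels

def volume(box):
    """Lebesgue measure of a box."""
    vol = Fraction(1)
    for lo, hi in box:
        vol *= hi - lo
    return vol

rng = random.Random(0)
levels = build_levels(d=2, depth=5, choose_direction=lambda box: rng.randrange(2))
```

Whatever directions are chosen, level $m$ consists of $2^m$ boxes of measure $2^{-m}$, while the shapes of the boxes are not controlled.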

To construct an anisotropic partition $\mathcal{P}$ of $\mathbb{R}^d$, one can proceed as follows: First, cover $\mathbb{R}^d$
by a growing sequence of dyadic boxes $I_0 \subset I_1 \subset \dots$, $|I_j| = 2^j$, $\mathbb{R}^d = \bigcup_j I_j$, starting from
an arbitrary dyadic box $I_0$ and growing the consecutive boxes infinitely many times in all
four directions. Second, subdivide each box $I_j$ and its sibling (contained in $I_{j+1}$) as above.

A typical property of the anisotropic dyadic partitions is that each level $\mathcal{P}_m$ of such a
partition $\mathcal{P}$ consists of dyadic boxes $I$ with $|I| = 2^{-m}$ and at the same time there could be
extremely (uncontrollably) long and narrow boxes in $\mathcal{P}_m$.

• Local polynomial approximation. Fix a box $I \subset \mathbb{R}^d$ and let $f \in L_p(I)$. Then

$$E_k(f, I)_p := \inf_{P \in \Pi_k} \|f - P\|_{L_p(I)} \qquad (2.1)$$

is the error of $L_p(I)$ approximation to $f$ from $\Pi_k$, the set of all algebraic polynomials of
degree $< k$. The modulus of smoothness $\omega_k(f, I)_p$ is defined as usual by

$$\omega_k(f, I)_p := \sup_{h \in \mathbb{R}^d} \|\Delta_h^k(f, \cdot)\|_{L_p(I)}, \qquad (2.2)$$

where $\Delta_h^k(f, x)$ is the $k$th difference with step $h \in \mathbb{R}^d$ and $\Delta_h^k(f, x) := 0$ if the segment
$[x, x + kh]$ is not entirely contained in $I$.

We shall need the fact that $E_k(f, I)_p$ and $\omega_k(f, I)_p$ are equivalent:

$$E_k(f, I)_p \approx \omega_k(f, I)_p \qquad (2.3)$$

with constants of equivalence depending only on $p$, $k$, and $d$. Equivalence (2.3) follows from
the case when $I = [0, 1)^d$ by a simple change of variables; the upper estimate is Whitney's
theorem (see [2] if $p > 1$ and [22] if $0 < p < 1$) and the lower estimate follows by the fact
that $\Delta_h^k(P, x) = 0$ if $P \in \Pi_k$.
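
The fact used for the lower estimate, that the $k$th difference annihilates polynomials of degree $< k$, is easy to check numerically. The sketch below is univariate for simplicity (in $d$ variables, the restriction of $P \in \Pi_k$ to the line through $x$ in direction $h$ is a univariate polynomial of degree $< k$); the helper name `kth_difference` is hypothetical.

```python
from math import comb

def kth_difference(f, x, h, k):
    """k-th forward difference: sum_j (-1)^(k-j) C(k, j) f(x + j h)."""
    return sum((-1) ** (k - j) * comb(k, j) * f(x + j * h) for j in range(k + 1))

def cubic(x):
    # A polynomial of degree 3 < k = 4.
    return 2 * x**3 - x**2 + 5 * x - 7

# Vanishes (up to rounding) because the degree is below k.
diff_cubic = kth_difference(cubic, x=0.3, h=0.1, k=4)

# For x^k itself the k-th difference equals k! * h^k, so it does not vanish.
diff_quartic = kth_difference(lambda x: x**4, x=0.3, h=0.1, k=4)
```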

We shall often use the following lemma which establishes the equivalence of different
norms of polynomials over different sets.

Lemma 2.1. Suppose $R := I \setminus J$, where $J \subset I$ and $I$, $J$ are dyadic boxes in $\mathbb{R}^d$ or $J = \emptyset$.
Let $I' \subset R$ be also a dyadic box with $|I'| = |I|/2$. Then, for each polynomial $P \in \Pi_k$ and
$0 < \tau, p < \infty$,

$$\|P\|_{L_p(I)} \approx \|P\|_{L_p(R)} \approx \|P\|_{L_p(I')} \qquad (2.4)$$

and

$$\|P\|_{L_\tau(R)} \approx |R|^{1/\tau - 1/p} \|P\|_{L_p(R)} \qquad (2.5)$$

with constants of equivalence depending only on $p$, $\tau$, $k$, and $d$.

Proof. This lemma follows immediately from the obvious case $I = [0, 1)^d$ (all norms of a
polynomial are equivalent) by change of variables. □

We find useful the concept of near best approximation which we borrowed from [8]. A
polynomial $Q \in \Pi_k$ is said to be a near best $L_p(I)$ approximation to $f$ from $\Pi_k$ with constant
$A$ if

$$\|f - Q\|_{L_p(I)} \le A\, E_k(f, I)_p.$$

Note that if $p > 1$, then a near best $L_p(I)$ approximation $Q := Q_I(f)$ from $\Pi_k$ can be realized
by a linear projector.

Lemma 2.2. Let $0 < q < p$ and let $Q_I$ be a near best $L_q(I)$ approximation to $f$ from $\Pi_k$.
Then $Q_I$ is a near best $L_p(I)$ approximation to $f$ from $\Pi_k$.

Proof. See [8]. □

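• Definition of B-spaces on $\mathbb{R}^d$. Let $\mathcal{P}$ be an arbitrary anisotropic dyadic partition of
$\mathbb{R}^d$ ($d > 1$), $\alpha > 0$, $0 < p, q \le \infty$, and $k \ge 1$. We define the B-space $B_{pq}^{\alpha k}(\mathcal{P})$ as the set of all
functions $f \in L_p(\mathbb{R}^d)$ such that

$$\|f\|_{B_{pq}^{\alpha k}(\mathcal{P})} := \Big( \sum_{m \in \mathbb{Z}} \Big[ \sum_{I \in \mathcal{P}_m} \big( |I|^{-\alpha} \omega_k(f, I)_p \big)^p \Big]^{q/p} \Big)^{1/q} = \Big( \sum_{m \in \mathbb{Z}} \Big[ 2^{\alpha m} \Big( \sum_{I \in \mathcal{P}_m} \omega_k(f, I)_p^p \Big)^{1/p} \Big]^q \Big)^{1/q} \qquad (2.6)$$

is finite, where the $\ell_q$-norm is replaced by the sup-norm if $q = \infty$ as usual. From (2.3), it
follows that

$$\|f\|_{B_{pq}^{\alpha k}(\mathcal{P})} \approx N_1(f, \mathcal{P}) := \Big( \sum_{m \in \mathbb{Z}} \Big[ \sum_{I \in \mathcal{P}_m} \big( |I|^{-\alpha} E_k(f, I)_p \big)^p \Big]^{q/p} \Big)^{1/q}. \qquad (2.7)$$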

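We now introduce the linear piecewise polynomial approximation generated by $\mathcal{P}$. Let
$\mathcal{S}_m^k := \mathcal{S}_m^k(\mathcal{P})$ be the set of all piecewise polynomials of degree $< k$ on boxes $I \in \mathcal{P}_m$, that
is, $S \in \mathcal{S}_m^k$ if $S = \sum_{I \in \mathcal{P}_m} \mathbb{1}_I \cdot P_I$, where $P_I \in \Pi_k$. Evidently, $\dots \subset \mathcal{S}_{-1}^k \subset \mathcal{S}_0^k \subset \mathcal{S}_1^k \subset \dots$. We
denote

$$\mathbb{L}_p := \mathbb{L}_p(\mathcal{P}, k) := \overline{\bigcup_{m \in \mathbb{Z}} (\mathcal{S}_m^k \cap L_p)},$$

where the closure is taken in $L_p(\mathbb{R}^d)$. Evidently, $\mathbb{L}_p$ is a subspace of $L_p$ and

$$\mathbb{L}_p = \mathrm{span}\, \{\mathbb{1}_I \cdot P_I : P_I \in \Pi_k,\ I \in \mathcal{P}\},$$

where "span" means "closed span in $L_p$". We denote by $S_m^k(f)_p := S_m^k(f, \mathcal{P})_p$ the error of
$L_p$ approximation to $f$ from $\mathcal{S}_m^k$, i.e., $S_m^k(f)_p := \inf_{S \in \mathcal{S}_m^k} \|f - S\|_p$. Clearly, if $f \in L_p$, then
$f \in \mathbb{L}_p$ if and only if $\lim_{m \to \infty} S_m^k(f)_p = 0$. It may happen that $\mathbb{L}_p(\mathcal{P}, k) \ne L_p$. However, if
$\sup\{\mathrm{diam}(I) : I \in \mathcal{P}_m\} \to 0$ as $m \to \infty$, then $\mathbb{L}_p(\mathcal{P}, k) = L_p$.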
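
As a small discrete illustration of the errors $S_m^k(f)_p$, the following Python sketch treats the simplest case only: $d = 1$, $k = 1$ (piecewise constants), $p = 2$, and the uniform dyadic partition of $[0, 1)$, with $f$ replaced by samples on a grid of $2^J$ points. In this setting the best approximation on each interval is the mean value. The function name `block_mean_errors` and the sampled test function are assumptions made for the example.

```python
import math

def block_mean_errors(samples):
    """Discrete L_2 errors of piecewise constant approximation on levels 0..J.

    samples: function values on a uniform grid of [0, 1) with 2**J points.
    Level m uses 2**m intervals; on each the best L_2 constant is the mean.
    """
    n = len(samples)
    J = n.bit_length() - 1  # assumes n == 2**J
    errors = []
    for m in range(J + 1):
        size = n >> m  # number of samples per interval at level m
        squared = 0.0
        for start in range(0, n, size):
            block = samples[start:start + size]
            mean = sum(block) / size
            squared += sum((v - mean) ** 2 for v in block)
        errors.append(math.sqrt(squared / n))
    return errors

samples = [math.sqrt((i + 0.5) / 64) for i in range(64)]
errors = block_mean_errors(samples)
```

Because the levels are nested, the errors cannot increase with $m$; the B-norm weighs how fast they decay against the factor $2^{\alpha m}$.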

Clearly, by (2.7),

$$N_1(f, \mathcal{P}) = \Big( \sum_{m \in \mathbb{Z}} \big( 2^{\alpha m} S_m^k(f, \mathcal{P})_p \big)^q \Big)^{1/q}. \qquad (2.8)$$

Therefore, the B-spaces $B_{pq}^{\alpha k}(\mathcal{P})$ are approximation spaces generated by $\{S_m^k(f, \mathcal{P})_p\}$ (compare
with the definition in (3.6)).

By (2.8), if $f \in B_{pq}^{\alpha k}(\mathcal{P})$, then $S_m^k(f, \mathcal{P})_p \to 0$ as $m \to -\infty$, which together with condition
(c) on dyadic partitions implies that $\|f\|_{B_{pq}^{\alpha k}(\mathcal{P})} = 0$ if and only if $f = 0$ a.e. (see the proof of
Theorem 2.4 in the Appendix). Therefore, $\|\cdot\|_{B_{pq}^{\alpha k}(\mathcal{P})}$ is a norm if $p, q \ge 1$ and a quasi-norm
otherwise. For the remainder of this article, "norm" will stand for "norm" or "quasi-norm".

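Let $Q_{I,\eta}(f)$ be a polynomial of near best $L_\eta(I)$ approximation to $f$ from $\Pi_k$ with some
constant $A$ (the same for all $I \in \mathcal{P}$). Note that $Q_{I,\eta}(f)$ can be defined as a linear projector
if $\eta > 1$. Then $T_{m,\eta}(f) := T_{m,\eta}(f, \mathcal{P}) := \sum_{I \in \mathcal{P}_m} \mathbb{1}_I \cdot Q_{I,\eta}(f)$ is a near best $L_\eta$ approximation to
$f$ from $\mathcal{S}_m^k$. We define

$$t_m := t_{m,\eta}(f) := T_{m,\eta}(f) - T_{m-1,\eta}(f). \qquad (2.9)$$

We now introduce a new norm in $B_{pq}^{\alpha k}(\mathcal{P})$ by

$$N_2(f, \mathcal{P}) := \Big( \sum_{m \in \mathbb{Z}} \big( 2^{\alpha m} \|t_{m,\eta}(f)\|_p \big)^q \Big)^{1/q}, \quad \text{where } 0 < \eta < p. \qquad (2.10)$$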

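Lemma 2.3. The norms $\|\cdot\|_{B_{pq}^{\alpha k}(\mathcal{P})}$, $N_1(\cdot)$, and $N_2(\cdot)$ are equivalent with constants of
equivalence independent of $\mathcal{P}$.

Proof. The equivalence of $\|\cdot\|_{B_{pq}^{\alpha k}(\mathcal{P})}$ and $N_1(\cdot)$ has already been indicated in (2.7).

Now, we show that $N_1(\cdot) \approx N_2(\cdot)$. Let $N_1(f) < \infty$. By Lemma 2.2, $Q_{I,\eta}(f)$ is a near
best $L_p(I)$ approximation to $f$ from $\Pi_k$ and hence $\|f - T_{m,\eta}(f)\|_p \le c\, S_m^k(f)_p$. Therefore,

$$\|t_{m,\eta}(f)\|_p \le c\|f - T_{m,\eta}(f)\|_p + c\|f - T_{m-1,\eta}(f)\|_p \le c\, S_m^k(f)_p + c\, S_{m-1}^k(f)_p.$$

This implies $N_2(f) \le c N_1(f)$.

In the other direction, if $N_2(f) < \infty$, then it is easily seen that

$$S_m^k(f)_p \le \|f - T_{m,\eta}(f)\|_p \le \Big( \sum_{j=m+1}^{\infty} \|t_{j,\eta}(f)\|_p^{\lambda} \Big)^{1/\lambda}, \quad \lambda := \min\{p, 1\}. \qquad (2.11)$$

To complete the proof, we need the following discrete Hardy inequality: If $\{x_m\}_{m \in \mathbb{Z}}$ and
$\{y_m\}_{m \in \mathbb{Z}}$ are two sequences of nonnegative numbers such that $y_m \le \big( \sum_{j=m+1}^{\infty} x_j^{\lambda} \big)^{1/\lambda}$, $\lambda > 0$,
then

$$\sum_{m \in \mathbb{Z}} (2^{m\alpha} y_m)^q \le c \sum_{m \in \mathbb{Z}} (2^{m\alpha} x_m)^q, \quad \alpha, q > 0, \qquad (2.12)$$

where $c = c(\lambda, \alpha, q)$. This inequality follows easily by Hölder's inequality. We use (2.8),
(2.11), and (2.12) to obtain $N_1(f) \le c N_2(f)$. Therefore, $N_1(f) \approx N_2(f)$. □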
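
The discrete Hardy inequality (2.12) can be explored numerically on finite sequences. The sketch below takes $y_m$ equal to the tail sum and returns both sides of (2.12); the name `hardy_sides` is hypothetical. As an added observation (not taken from the text), when $q \le \lambda$ the subadditivity of $t \mapsto t^{q/\lambda}$ and a geometric sum give the explicit constant $c = 1/(2^{\alpha q} - 1)$, which is what the example compares against.

```python
import random

def hardy_sides(x, lam, alpha, q):
    """Return (lhs, rhs) of (2.12) for a finite sequence x_0, ..., x_{M-1}.

    y_m is set to (sum_{j > m} x_j**lam) ** (1 / lam), the extremal choice.
    """
    M = len(x)
    y = [0.0] * M
    tail = 0.0
    for m in range(M - 1, -1, -1):
        y[m] = tail ** (1.0 / lam)  # tail currently holds sum over j > m
        tail += x[m] ** lam
    lhs = sum((2.0 ** (m * alpha) * y[m]) ** q for m in range(M))
    rhs = sum((2.0 ** (m * alpha) * x[m]) ** q for m in range(M))
    return lhs, rhs

rng = random.Random(1)
x = [rng.random() for _ in range(20)]
lam, alpha, q = 1.0, 1.0, 0.5  # q <= lam, so c = 1 / (2**(alpha*q) - 1) suffices
lhs, rhs = hardy_sides(x, lam, alpha, q)
constant = 1.0 / (2.0 ** (alpha * q) - 1.0)
```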

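• The B-spaces $B_\tau^{\alpha k}(\mathcal{P})$ on $\mathbb{R}^d$. For the purposes of nonlinear piecewise polynomial and
n-term rational approximation, we shall only need a specific class of B-spaces, namely, the
spaces $B_{\tau\tau}^{\alpha k}(\mathcal{P})$. Therefore, for the rest of this section, we focus our attention exclusively on
these specific B-spaces.

We shall always assume that $0 < p < \infty$, $\alpha > 0$, $k \ge 1$, and $\tau$ is defined by $1/\tau := \alpha + 1/p$.
We shall briefly denote the B-space $B_{\tau\tau}^{\alpha k}(\mathcal{P})$ by $B_\tau^{\alpha k}(\mathcal{P})$ or simply by $B_\tau^{\alpha}$. By the definition
of B-spaces in (2.6), we have

$$\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} := \Big( \sum_{I \in \mathcal{P}} \big( |I|^{-\alpha} \omega_k(f, I)_\tau \big)^\tau \Big)^{1/\tau} \qquad (2.13)$$

and, using Lemma 2.3,

$$\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} \approx N_2(f, \mathcal{P}) := \Big( \sum_{I \in \mathcal{P}} \big( |I|^{-\alpha} \|t_{I,\eta}(f)\|_\tau \big)^\tau \Big)^{1/\tau} \quad \text{if } 0 < \eta < \tau, \qquad (2.14)$$

where $t_{I,\eta}(f) := \mathbb{1}_I \cdot t_{m,\eta}(f)$ if $I \in \mathcal{P}_m$, $m \in \mathbb{Z}$.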

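In some instances, the $B_\tau^{\alpha}$-norms from (2.13)-(2.14) are not quite convenient since the
$L_\tau$-norm which they involve is not very friendly when $\tau < 1$. This is the case when the
smoothness parameter $\alpha > 1$. We next show that this drawback of the above norms can be
overcome. We introduce the following new B-norms: For $f \in L_\eta$, $0 < \eta < p$, we set

$$N_{\omega,\eta}(f, \mathcal{P}) := \Big( \sum_{I \in \mathcal{P}} \big( |I|^{1/p - 1/\eta} \omega_k(f, I)_\eta \big)^\tau \Big)^{1/\tau} \qquad (2.15)$$

and

$$N_{t,\eta}(f, \mathcal{P}) := \Big( \sum_{I \in \mathcal{P}} \big( |I|^{1/p - 1/\eta} \|t_{I,\eta}(f)\|_\eta \big)^\tau \Big)^{1/\tau}, \qquad (2.16)$$

where $t_{I,\eta}(f) := \mathbb{1}_I \cdot t_{m,\eta}(f, \mathcal{P})$ if $I \in \mathcal{P}_m$, $m \in \mathbb{Z}$ (see (2.9)). Note that $N_{\omega,\tau}(f, \mathcal{P}) =
\|f\|_{B_\tau^{\alpha k}(\mathcal{P})}$. Using (2.5) and the relation $1/\tau = \alpha + 1/p$, we readily obtain

$$N_{t,\eta}(f, \mathcal{P}) \approx \Big( \sum_{I \in \mathcal{P}} \|t_{I,\eta}(f)\|_p^\tau \Big)^{1/\tau}. \qquad (2.17)$$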

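The following embedding theorem will be important for our further developments.

Theorem 2.4. If $f \in L_\eta$, $0 < \eta < p < \infty$, and $N_{t,\eta}(f, \mathcal{P}) < \infty$, then

$$f = \sum_{m \in \mathbb{Z}} t_{m,\eta}(f) \quad \text{a.e. on } \mathbb{R}^d \qquad (2.18)$$

with the series converging absolutely a.e., and

$$\|f\|_p \le \Big\| \sum_{m \in \mathbb{Z}} |t_{m,\eta}(f)| \Big\|_p \le c\, N_{t,\eta}(f, \mathcal{P}), \qquad (2.19)$$

where $c = c(\alpha, k, p, d, \eta)$.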

We shall deduce this theorem from the following more general embedding theorem:

Theorem 2.5. Let $1 < p < \infty$. Suppose $\{\Phi_m\}$ is a sequence of functions on $\mathbb{R}^d$ with the
properties:

(i) $\Phi_m \in L_\infty$, $\mathrm{supp}\, \Phi_m \subset E_m$ with $0 < |E_m| < \infty$ and

$$\|\Phi_m\|_\infty \le c_1 |E_m|^{-1/p} \|\Phi_m\|_p.$$

(ii) If $x \in E_m$, then

$$\sum_{E_j \ni x,\ |E_j| > |E_m|} \big( |E_m| / |E_j| \big)^{1/p} \le c_1,$$

where the summation is over all indices $j$ for which $E_j$ satisfy the indicated conditions. Then
we have

$$\Big\| \sum_j |\Phi_j(\cdot)| \Big\|_p \le c \Big( \sum_j \|\Phi_j\|_p^\tau \Big)^{1/\tau}, \quad 0 < \tau < p,$$

where $c = c(p, \tau, c_1)$.
To avoid unnecessary technicalities at this early stage, we shall give the proofs of
Theorems 2.4-2.5 as well as the one of the next theorem in the Appendix.

Theorem 2.6. The norms $\|\cdot\|_{B_\tau^{\alpha k}(\mathcal{P})}$, $N_{\omega,\eta}(\cdot, \mathcal{P})$ ($0 < \eta < p$), and $N_{t,\eta}(\cdot, \mathcal{P})$ ($0 < \eta < p$),
defined in (2.13) and (2.15)-(2.16), are equivalent with constants of equivalence depending
only on $\alpha$, $k$, $p$, $d$, and $\eta$. Furthermore, the equivalence of $\|\cdot\|_{B_\tau^{\alpha k}(\mathcal{P})}$ and $N_{\omega,\eta}(\cdot, \mathcal{P})$ is no
longer valid if $\eta > p$.

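• B-spaces on $\Omega$. We shall only define the B-spaces $B_\tau^{\alpha k}(\mathcal{P})$ on $\Omega$ which we need in nonlinear
piecewise polynomial and rational approximation. The more general B-spaces $B_{pq}^{\alpha k}(\mathcal{P})$ on $\Omega$
can be introduced in an obvious way.

We again assume that $0 < p < \infty$, $\alpha > 0$, $k \ge 1$, and $1/\tau := \alpha + 1/p$. Let $\mathcal{P} = \bigcup_{m \ge 0} \mathcal{P}_m$
be an arbitrary dyadic partition of $\Omega$ ($|\Omega| = 1$). We define the space $B_\tau^{\alpha} := B_\tau^{\alpha k}(\mathcal{P})$ as the
set of all $f \in L_\tau(\Omega)$ such that

$$|f|_{B_\tau^{\alpha k}(\mathcal{P})} := \Big( \sum_{I \in \mathcal{P}} \big( |I|^{-\alpha} \omega_k(f, I)_\tau \big)^\tau \Big)^{1/\tau} < \infty. \qquad (2.20)$$

Evidently, $|f + P|_{B_\tau^{\alpha}} = |f|_{B_\tau^{\alpha}}$ for $P \in \Pi_k$ and hence $|\cdot|_{B_\tau^{\alpha}}$ is a semi-norm if $\tau \ge 1$ and
a semi-quasi-norm if $\tau < 1$.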

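By Theorems 2.7-2.8 below, if $f \in B_\tau^{\alpha k}(\mathcal{P})$ then $f \in L_p(\Omega)$. Therefore, it is natural to
define a norm in $B_\tau^{\alpha k}(\mathcal{P})$ by

$$\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} := \|f\|_{L_p(\Omega)} + |f|_{B_\tau^{\alpha k}(\mathcal{P})}. \qquad (2.21)$$

Similarly as in (2.8), we have

$$\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} \approx \|f\|_p + \Big( \sum_{m \ge 0} \big( 2^{\alpha m} S_m^k(f, \mathcal{P})_\tau \big)^\tau \Big)^{1/\tau}, \qquad (2.22)$$

where $S_m^k(f, \mathcal{P})_\tau$ is the error of linear piecewise polynomial approximation, defined similarly
as in the case of B-spaces on $\mathbb{R}^d$ (see the definition above (2.8)).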

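In analogy to (2.15), we introduce a more general norm by

$$N_{\omega,\eta}(f, \mathcal{P}) := \|f\|_p + \Big( \sum_{I \in \mathcal{P}} \big( |I|^{1/p - 1/\eta} \omega_k(f, I)_\eta \big)^\tau \Big)^{1/\tau}, \quad 0 < \eta < p. \qquad (2.23)$$

Also, similarly as in the definition of B-norms on $\mathbb{R}^d$ (see (2.9), (2.14)), we define the operators
$Q_{I,\eta}(f)$, $T_{m,\eta}(f) := T_{m,\eta}(f, \mathcal{P})$, $t_{m,\eta}(f) := t_{m,\eta}(f, \mathcal{P})$ ($m \ge 0$), and $t_{I,\eta}(f)$, $f \in L_\eta(\Omega)$,
with the natural modification $T_{-1,\eta}(f) := 0$, i.e., $t_{0,\eta}(f) := T_{0,\eta}(f) := Q_{\Omega,\eta}(f)$. We define
another norm by

$$N_{t,\eta}(f, \mathcal{P}) := \Big( \sum_{I \in \mathcal{P}} \big( |I|^{1/p - 1/\eta} \|t_{I,\eta}(f)\|_\eta \big)^\tau \Big)^{1/\tau} \approx \Big( \sum_{I \in \mathcal{P}} \|t_{I,\eta}(f)\|_p^\tau \Big)^{1/\tau}, \quad 0 < \eta < p. \qquad (2.24)$$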

Theorem 2.5 implies immediately the following analogue of Theorem 2.4:

Theorem 2.7. If $f \in L_\eta(\Omega)$, $0 < \eta < p < \infty$, and $N_{t,\eta}(f, \mathcal{P}) < \infty$, then

$$f = \sum_{m \ge 0} t_{m,\eta}(f) \ \text{ absolutely a.e. and } \ \|f\|_p \le \Big\| \sum_{m \ge 0} |t_{m,\eta}(f)| \Big\|_p \le c\, N_{t,\eta}(f, \mathcal{P}).$$

We proceed similarly as in the proof of Theorem 2.6 (see the Appendix) to prove the
equivalence of the above defined B-norms:

Theorem 2.8. The norms $\|\cdot\|_{B_\tau^{\alpha k}(\mathcal{P})}$, $N_{\omega,\eta}(\cdot, \mathcal{P})$ ($0 < \eta < p$), and $N_{t,\eta}(\cdot, \mathcal{P})$ ($0 < \eta < p$),
defined in (2.21)-(2.24), are equivalent with constants of equivalence depending only on $\alpha$, $k$,
$p$, $d$, and $\eta$.

• Comparison of B-spaces with Besov spaces. We first recall the definition of Besov
spaces on $E = \mathbb{R}^d$, $E = [a, b]^d$ or on a Lipschitz domain $E \subset \mathbb{R}^d$ ($d > 1$). The Besov space
$B_q^s(L_p) := B_q^s(L_p(E))$, $s > 0$, $1 \le p, q \le \infty$, is defined as the set of all functions $f \in L_p(E)$
such that

$$|f|_{B_q^s(L_p)} := \Big( \int_0^\infty \big( t^{-s} \omega_k(f, t)_p \big)^q \frac{dt}{t} \Big)^{1/q} < \infty \qquad (2.25)$$

with the $L_q$-norm replaced by the sup-norm if $q = \infty$, where $k := [s] + 1$ and $\omega_k(f, t)_p$ is
the $k$-th modulus of smoothness of $f$ in $L_p(E)$. The norm in $B_q^s(L_p)$ is usually defined by
$\|f\|_{B_q^s(L_p)} := \|f\|_p + |f|_{B_q^s(L_p)}$. It is well known that if in (2.25) $k$ is replaced by any other
$k > [s] + 1$, then the resulting space would be the same with an equivalent norm. However,
the situation is totally different when $p < 1$ and this is the reason for introducing $k$ as a
parameter of the Besov spaces with the next definition. We define the space

$$B_q^{s,k}(L_p) := B_q^{s,k}(L_p(E)), \quad 0 < p, q \le \infty,\ s > 0,\ k \ge 1, \qquad (2.26)$$

as the Besov space $B_q^s(L_p(E))$ from above, where the parameters $k$ and $s$ are already set
independent of each other.

For the theory of nonlinear (regular) spline approximation in $L_p(E)$, $0 < p < \infty$, one can
utilize the Besov space

$$B_\tau^{d\alpha,k}(L_\tau) := B_\tau^{d\alpha,k}(L_\tau(E))$$

with parameters set as elsewhere in this article: $k \ge 1$, $\alpha > 0$, and $1/\tau := \alpha + 1/p$ (see [17]
when $d = 1$, and [5], [7] when $d > 1$). Since $B_\tau^{d\alpha,k}(L_\tau)$ is embedded in $L_p$, it is natural to
define a norm in $B_\tau^{d\alpha,k}(L_\tau)$ by $\|f\|_{B_\tau^{d\alpha,k}(L_\tau)} := \|f\|_p + |f|_{B_\tau^{d\alpha,k}(L_\tau)}$. In the following, we shall
restrict our attention to the case $E = \mathbb{R}^d$ ($d > 1$).

We call a dyadic partition $\mathcal{P}$ of $\mathbb{R}^d$ regular if there is a constant $K > 2$ such that for each
box $I =: I_1 \times \dots \times I_d$ from $\mathcal{P}$ we have $K^{-1} \le |I_\nu| / |I_\mu| \le K$, $1 \le \nu, \mu \le d$.

Now, if $\mathcal{P}$ is a regular dyadic partition of $\mathbb{R}^d$ and $f \in B_\tau^{d\alpha,k}(L_\tau)$, then $f \in B_\tau^{\alpha k}(\mathcal{P})$ and

$$\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} \le c \|f\|_{B_\tau^{d\alpha,k}(L_\tau)},$$

which easily follows using the following equivalence:

$$\omega_k(f, I)_\tau^\tau \approx \frac{1}{|I|} \int_{[0, \ell(I)]^d} \int_{I_{kh}} |\Delta_h^k(f, x)|^\tau \, dx \, dh, \quad I \in \mathcal{P}, \qquad (2.27)$$

where $I_{kh} := \{x \in I : [x, x + kh] \subset I\}$ and $\ell(I)$ is the maximal side of $I$ or $\mathrm{diam}(I)$ (see [20]
for the proof of (2.27) in the univariate case; the same proof applies to the multivariate case
as well). Notice that the smoothness parameters of B-spaces and Besov spaces above are
normalized differently. Thus the B-space $B_\tau^{\alpha k}(\mathcal{P})$ corresponds to the Besov space $B_\tau^{s,k}(L_\tau)$
with $s = d\alpha$.

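Using the idea of the proof of Theorem 2.6 in the Appendix, one can easily prove that,
for a regular dyadic partition $\mathcal{P}$,

$$B_\tau^{d\alpha,k}(L_\tau(\mathbb{R}^d)) = B_\tau^{\alpha k}(\mathcal{P}), \quad \text{if } 0 < \alpha < 1/p, \qquad (2.28)$$

with equivalent norms, and this is no longer true if $\alpha > 1/p$; $B_\tau^{\alpha k}(\mathcal{P})$ is much larger
than $B_\tau^{d\alpha,k}(L_\tau(\mathbb{R}^d))$ in this case. A key fact here is that, for each $I \in \mathcal{P}$ and $\alpha > 1/p$,
$\|\mathbb{1}_I\|_{B_\tau^{d\alpha,k}(L_\tau)} = \infty$, while at the same time $\|\mathbb{1}_I\|_{B_\tau^{\alpha k}(\mathcal{P})} \approx \|\mathbb{1}_I\|_p$. The same is true if $\mathbb{1}_I$ is
replaced by $P \cdot \mathbb{1}_I$, $P \in \Pi_k$, $P \ne 0$.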

Suppose now that $\mathcal{P}$ is an arbitrary dyadic partition of $\mathbb{R}^d$. As we mentioned in §2,
extremely long and narrow boxes may occur at any level and location of $\mathcal{P}$. Straightforward
calculations show that, for such a box $I \in \mathcal{P}$, even if $0 < \alpha < 1/p$ and $\alpha$ is as
small as we wish (fixed), $\|\mathbb{1}_I\|_{B_\tau^{d\alpha,k}(L_\tau)} / \|\mathbb{1}_I\|_p$ can be enormously (uncontrollably) large, while
$\|\mathbb{1}_I\|_{B_\tau^{\alpha k}(\mathcal{P})} / \|\mathbb{1}_I\|_p \approx 1$. This is why the Besov spaces are completely unsuitable for the theory
of piecewise polynomial approximation generated by anisotropic dyadic partitions (see also
the results of §3 below). The situation is quite similar when comparing two B-spaces over
completely different dyadic partitions.

3  Nonlinear  piecewise  polynomial  approximation 

In this section, we shall use the B-spaces introduced in §2 to characterize the nonlinear
piecewise polynomial approximation generated by an arbitrary dyadic partition $\mathcal{P}$ of $\mathbb{R}^d$.
The same results with almost identical proofs hold on any box $\Omega$.

We let $\Sigma_n^k(\mathcal{P})$ ($k \ge 1$) denote the nonlinear set consisting of all piecewise polynomial
functions

$$\varphi = \sum_{I \in \Lambda_n} \mathbb{1}_I \cdot P_I,$$

where $P_I \in \Pi_k$, $\Lambda_n \subset \mathcal{P}$, and $\#\Lambda_n \le n$. We denote by $\sigma_n(f, \mathcal{P})_p := \sigma_n^k(f, \mathcal{P})_p$ the error of
$L_p$ approximation to $f \in L_p(\mathbb{R}^d)$ from $\Sigma_n^k(\mathcal{P})$:

$$\sigma_n(f, \mathcal{P})_p := \inf_{\varphi \in \Sigma_n^k(\mathcal{P})} \|f - \varphi\|_p.$$

We next prove Jackson and Bernstein estimates for the above nonlinear approximation. Then
the desired characterization of the approximation spaces follows immediately by interpolation.
Throughout this section, we assume that $\mathcal{P}$ is an arbitrary dyadic partition of $\mathbb{R}^d$,
$0 < p < \infty$, $\alpha > 0$, $k \ge 1$, and $1/\tau := \alpha + 1/p$.

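Theorem 3.1. If $f \in B_\tau^{\alpha k}(\mathcal{P})$, then

$$\sigma_n(f, \mathcal{P})_p \le c\, n^{-\alpha} \|f\|_{B_\tau^{\alpha k}(\mathcal{P})}, \quad n = 1, 2, \dots,$$

with $c = c(\alpha, p, k, d)$.

Proof. By Theorem 2.4, $f$ can be represented in the form

$$f = \sum_{I \in \mathcal{P}} t_I \quad \text{a.e. on } \mathbb{R}^d \qquad (3.1)$$

with the series converging absolutely a.e., where $t_I = \mathbb{1}_I \cdot P_I$ with $P_I \in \Pi_k$ ($t_I := \mathbb{1}_I \cdot t_{m,\eta}$ if
$I \in \mathcal{P}_m$, $0 < \eta < p$). In addition to this, by Theorem 2.6,

$$\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} \approx \Big( \sum_{I \in \mathcal{P}} \|t_I\|_p^\tau \Big)^{1/\tau} =: N(f).$$

Case I: $1 < p < \infty$. We define $\mathcal{J}_\mu := \{I \in \mathcal{P} : 2^{-\mu} N(f) \le \|t_I\|_p < 2^{-\mu+1} N(f)\}$. Clearly,

$$\#\mathcal{J}_\mu \le 2^{\mu\tau}. \qquad (3.2)$$

We define

$$g_\mu := \sum_{I \in \mathcal{J}_\mu} t_I \quad \text{and} \quad G_m := \sum_{\mu \le m} g_\mu.$$

We have $G_m \in \Sigma_M^k(\mathcal{P})$ with $M := \sum_{\mu \le m} 2^{\mu\tau} = c\, 2^{m\tau}$. We use (3.1), (3.2), and Lemma 7.1
(as in the proof of Theorem 2.5) to obtain

$$\sigma_M(f, \mathcal{P})_p \le \Big\| \sum_{I \in \mathcal{P} \setminus \bigcup_{\nu \le m} \mathcal{J}_\nu} t_I \Big\|_p \le \sum_{\mu = m+1}^{\infty} \|g_\mu\|_p \le c \sum_{\mu = m+1}^{\infty} 2^{-\mu} N(f) (\#\mathcal{J}_\mu)^{1/p} \le c\, N(f) \sum_{\mu = m+1}^{\infty} 2^{-\mu(1 - \tau/p)}$$

$$\le c\, N(f)\, 2^{-m(1 - \tau/p)} = c\, M^{-(1/\tau - 1/p)} N(f) = c\, M^{-\alpha} N(f),$$

which implies the theorem in Case I.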

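Case II: $0 < p \le 1$. We let $\|t_{I_1}\|_p \ge \|t_{I_2}\|_p \ge \dots$ be a nonincreasing rearrangement of the
sequence $\{\|t_I\|_p\}$ and define

$$\varphi := \sum_{j=1}^{n} t_{I_j}.$$

To estimate $\|f - \varphi\|_p$ we shall use the following simple inequality: If $x_1 \ge x_2 \ge \dots \ge 0$ and
$0 < \tau < p$, then

$$\Big( \sum_{j=n+1}^{\infty} x_j^p \Big)^{1/p} \le n^{1/p - 1/\tau} \Big( \sum_{j=1}^{\infty} x_j^\tau \Big)^{1/\tau}.$$

We obtain

$$\|f - \varphi\|_p \le \Big\| \sum_{j=n+1}^{\infty} |t_{I_j}| \Big\|_p \le \Big( \sum_{j=n+1}^{\infty} \|t_{I_j}\|_p^p \Big)^{1/p} \le c\, n^{1/p - 1/\tau} \Big( \sum_{j=1}^{\infty} \|t_{I_j}\|_p^\tau \Big)^{1/\tau} \le c\, n^{-\alpha} \|f\|_{B_\tau^{\alpha k}(\mathcal{P})}. \quad □$$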
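
The simple inequality used in Case II is what converts "keep the $n$ largest terms" into the rate $n^{-\alpha}$, since $1/p - 1/\tau = -\alpha$. A minimal Python check on a random finite sequence follows; the function name `tail_bound_sides` and the parameter values are illustrative assumptions.

```python
import random

def tail_bound_sides(x, n, p, tau):
    """Return (lhs, rhs) of the inequality for the nonincreasing rearrangement of x.

    lhs = (sum_{j > n} x_j^p)^(1/p), rhs = n^(1/p - 1/tau) * (sum_j x_j^tau)^(1/tau),
    with 0 < tau < p and n >= 1.
    """
    xs = sorted(x, reverse=True)  # nonincreasing rearrangement
    lhs = sum(v ** p for v in xs[n:]) ** (1.0 / p)
    rhs = n ** (1.0 / p - 1.0 / tau) * sum(v ** tau for v in xs) ** (1.0 / tau)
    return lhs, rhs

rng = random.Random(2)
x = [rng.random() ** 3 for _ in range(200)]
p, tau = 1.0, 0.5  # corresponds to alpha = 1/tau - 1/p = 1
results = [tail_bound_sides(x, n, p, tau) for n in range(1, len(x))]
```

The left-hand side is the $\ell_p$ norm of what is discarded after keeping the $n$ largest entries, so the check mirrors the greedy selection of the terms $t_{I_j}$ in the proof.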
Theorem 3.2. If $\varphi \in \Sigma_n^k(\mathcal{P})$, then

$$\|\varphi\|_{B_\tau^{\alpha k}(\mathcal{P})} \le c\, n^{\alpha} \|\varphi\|_p \qquad (3.3)$$

with $c = c(\alpha, p, k, d)$.

Proof. Let $\varphi = \sum_{I \in \Lambda} \mathbb{1}_I \cdot P_I$, where $P_I \in \Pi_k$, $\Lambda \subset \mathcal{P}$, $\#\Lambda \le n$, $n \ge 1$. To prove (3.3), we
shall use the natural tree structure in $\mathcal{P}$ induced by the inclusion relation: Each box $I \in \mathcal{P}$
has two children (boxes $J_1, J_2 \subset I$ such that $I = J_1 \cup J_2$ and $|J_1| = |J_2| = (1/2)|I|$) and one
parent in $\mathcal{P}$. Let $I_0 \in \mathcal{P}$ be the smallest box containing all boxes from $\Lambda$ and let $\mathcal{T}$ be the
minimal binary subtree of $\mathcal{P}$ containing $\Lambda \cup \{I_0\}$. So, $\mathcal{T}$ is the set of all boxes in $\mathcal{P}$ which
contain at least one box from $\Lambda$ and are contained in $I_0$. We introduce the following subsets
of $\mathcal{T}$:

(i) $\mathcal{T}_1$, the set of all final boxes in $\mathcal{T}$ (boxes not containing other boxes from $\mathcal{T}$),

(ii) $\mathcal{T}_2$, the set of all branching boxes in $\mathcal{T}$ (boxes with both children in $\mathcal{T}$) and, in addition,
we include $I_0$ in $\mathcal{T}_2$,

(iii) $\mathcal{T}_3$, the set of all children of branching boxes in $\mathcal{T}$,

(iv) $\mathcal{T}_4$, the set of all chain boxes in $\mathcal{T}$ (boxes with exactly one child in $\mathcal{T}$), excluding $I_0$
if $I_0$ has only one child in $\mathcal{T}$.

Obviously $\mathcal{T}_1 \subset \Lambda$ and hence $\#\mathcal{T}_2 \le \#\mathcal{T}_1 \le n$ and $\#\mathcal{T}_3 \le 2n$. Note that $\#\mathcal{T}_4$ can be
much larger than $\#\Lambda$.
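
The counting of final, branching, and chain boxes can be illustrated on an abstract binary tree. In the sketch below a box is encoded by a pair `(level, index)` whose children are `(level + 1, 2 * index)` and `(level + 1, 2 * index + 1)`; this encoding, the rooted tree (as for a partition of $\Omega$), and the names `minimal_subtree` and `classify` are assumptions of the example and say nothing about the shapes of the boxes.

```python
def ancestors(node):
    """Chain from the node itself up to the root (0, 0)."""
    m, i = node
    chain = [node]
    while m > 0:
        m, i = m - 1, i // 2
        chain.append((m, i))
    return chain

def minimal_subtree(Lambda):
    """Return (tree, top): all nodes between some node of Lambda and the
    smallest common ancestor `top` of Lambda (the analogue of I_0)."""
    common = set(ancestors(next(iter(Lambda))))
    for node in Lambda:
        common &= set(ancestors(node))
    top = max(common)  # common ancestors form a chain; deepest level wins
    tree = set()
    for node in Lambda:
        for anc in ancestors(node):
            tree.add(anc)
            if anc == top:
                break
    return tree, top

def classify(tree):
    """Split the tree into final, branching, and chain nodes."""
    final, branching, chain = set(), set(), set()
    for (m, i) in tree:
        kids = sum(((m + 1, 2 * i + s) in tree) for s in (0, 1))
        if kids == 0:
            final.add((m, i))
        elif kids == 1:
            chain.add((m, i))
        else:
            branching.add((m, i))
    return final, branching, chain

Lambda = {(3, 0), (3, 5), (5, 17), (2, 1)}
tree, top = minimal_subtree(Lambda)
final, branching, chain = classify(tree)
```

In this example there are four final nodes and three branching nodes, while the chain nodes are the ones whose number is not controlled by $\#\Lambda$.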

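The sets $\Lambda$ and $\mathcal{T}$ generate a natural subdivision of $I_0$ into a union of disjoint rings.
By definition, $R$ is a ring if $R = I \setminus J$ with $I \in \mathcal{P}$ and $J \in \mathcal{P}$ or $J = \emptyset$. We say that
$R = I \setminus J$ is a maximal ring if (a) $I \in \mathcal{T}$ and $J \in \mathcal{T}$ or $J = \emptyset$, (b) $R$ does not contain boxes
from $\Lambda$ which are smaller than $I$, and (c) $R$ is maximal with these two properties ($R$ is not
contained in another such). We denote by $\mathcal{R}$ the set of all maximal rings (generated by $\Lambda$).
For $R \in \mathcal{R}$, we denote by $I_R$ and $J_R$ the defining boxes of $R$, that is, $R =: I_R \setminus J_R$ with
$I_R \in \mathcal{T}$ and $J_R \in \mathcal{T}$ or $J_R = \emptyset$. Going further, we denote $\mathcal{R}_m := \{R \in \mathcal{R} : |I_R| = 2^{-m}\}$.
Then $\mathcal{R} = \bigcup_{m \in \mathbb{Z}} \mathcal{R}_m$. Clearly, $\mathcal{R}$ consists of disjoint subsets of $I_0$ and $I_0 = \bigcup_{R \in \mathcal{R}} R$. It is
readily seen that for each $R \in \mathcal{R}$, we have $I_R \in \mathcal{T}_1$ or $J_R \in \mathcal{T}_3$ or $I_R \in \mathcal{T} \cap \Lambda$ or $I_R = I_0$.
Therefore, $\#\mathcal{R} \le \#\mathcal{T}_1 + \#\mathcal{T}_3 + \#\Lambda \le 4n$.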

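Also, we introduce subrings (of maximal rings). Suppose $R \in \mathcal{R}$ and $R = I_R \setminus J_R$ with
$I_R \in \mathcal{P}_\ell$, $J_R \in \mathcal{P}_{\ell+\mu}$ ($\mu \ge 1$). Clearly, for each $\ell \le m < \ell + \mu$, there exists a unique $I' \in \mathcal{P}_m$
such that $J_R \subset I' \subset I_R$. We now define the subring $K_{R,m}$ of $R$ by $K_{R,m} := I' \setminus J_R$. In
addition, we define $\varphi_R := \mathbf{1}_R \cdot \varphi$ and $\varphi_{R,m} := \mathbf{1}_{K_{R,m}} \cdot \varphi = \mathbf{1}_{K_{R,m}} \cdot \varphi_R$ for $\ell \le m < \ell + \mu$
and $\varphi_{R,m} := 0$ if $m < \ell$ or $m \ge \ell + \mu$. Note that $\varphi_R$ is the restriction to $R$ of a polynomial
of degree $< k$ and $\varphi_{R,m}$ is the restriction of the same polynomial to $K_{R,m} \subset R$. Denote
$\mathcal{K}_m := \{R \in \mathcal{R} : K_{R,m} \ne \emptyset\}$. It is easily seen that if $I \subset I_0$, $I \in \mathcal{P}_m$ ($m \in \mathbb{Z}$), and $\varphi$ is not a
polynomial on $I$, then

$$I = \bigcup_{R\in\mathcal{R},\, R\subset I} R \;\cup \bigcup_{R\in\mathcal{K}_m,\, R\cap I\ne\emptyset} K_{R,m} \quad \text{(disjoint sets)}, \qquad (3.4)$$

where the second union on the right contains exactly one subring or none.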

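We need to estimate $\omega_k(\varphi, I)_\tau$ for every $I \in \mathcal{P}$. There are two possibilities for $I \in \mathcal{P}$:

(i) If $I \cap I_0 = \emptyset$ or $I \subset I_0$ but $I \subset R$ for some $R \in \mathcal{R}$, then $\varphi$ is a polynomial of degree
$< k$ on $I$ and hence $\omega_k(\varphi, I)_\tau = 0$.

(ii) If $\varphi$ is not a polynomial on $I$ and $I \in \mathcal{P}_m$ ($m \in \mathbb{Z}$), then we have, using (3.4),

$$\omega_k(\varphi, I)_\tau^\tau \le c\|\varphi\|_{L_\tau(I)}^\tau \le c \sum_{\nu=m+1}^{\infty} \sum_{R\in\mathcal{R}_\nu,\, R\subset I} \|\varphi_R\|_\tau^\tau + c \sum_{R\in\mathcal{K}_m,\, R\cap I\ne\emptyset} \|\varphi_{R,m}\|_\tau^\tau,$$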

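where the second sum contains one element or none. We use this estimate to obtain

$$\|\varphi\|_{B_\tau^{\alpha k}(\mathcal{P})}^\tau := \sum_{m\in\mathbb{Z}} \sum_{I\in\mathcal{P}_m} |I|^{-\alpha\tau}\omega_k(\varphi, I)_\tau^\tau \le c \sum_{m\in\mathbb{Z}} 2^{m\alpha\tau} \sum_{\nu=m+1}^{\infty} \sum_{R\in\mathcal{R}_\nu} \|\varphi_R\|_\tau^\tau + c \sum_{m\in\mathbb{Z}} 2^{m\alpha\tau} \sum_{R\in\mathcal{K}_m} \|\varphi_{R,m}\|_\tau^\tau =: \Sigma_1 + \Sigma_2.$$

Applying inequality (2.12) to the first sum above, we find

$$\Sigma_1 \le c \sum_{m\in\mathbb{Z}} 2^{m\alpha\tau} \sum_{R\in\mathcal{R}_m} \|\varphi_R\|_\tau^\tau \le c \sum_{R\in\mathcal{R}} \|\varphi_R\|_p^\tau,$$

where we used that $\|\varphi_R\|_\tau \le |R|^{1/\tau-1/p}\|\varphi_R\|_p \le 2^{-m\alpha}\|\varphi_R\|_p$, $R \in \mathcal{R}_m$, by Hölder's inequality
(recall that $1/\tau - 1/p = \alpha$).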
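We shall estimate $\Sigma_2$ using the following inequality:

$$\sum_{m\in\mathbb{Z}} \|\varphi_{R,m}\|_p^\tau \le c\|\varphi_R\|_p^\tau, \quad R \in \mathcal{R}. \qquad (3.5)$$

To prove this inequality, suppose that $R = I_R \setminus J_R$ with $I_R \in \mathcal{P}_\ell$ and $J_R \in \mathcal{P}_{\ell+\mu}$. Using
Lemma 2.1, we obtain

$$\|\varphi_{R,\ell+j}\|_p \le |K_{R,\ell+j}|^{1/p}\|\varphi_R\|_\infty \le c|K_{R,\ell+j}|^{1/p}|R|^{-1/p}\|\varphi_R\|_p \le c\,2^{-j/p}\|\varphi_R\|_p, \quad 0 \le j < \mu,$$

which implies (3.5).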

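As above, by Hölder's inequality, $\|\varphi_{R,m}\|_\tau \le 2^{-m\alpha}\|\varphi_{R,m}\|_p$. This and (3.5) imply

$$\Sigma_2 \le c \sum_{m\in\mathbb{Z}} \sum_{R\in\mathcal{K}_m} \|\varphi_{R,m}\|_p^\tau \le c \sum_{R\in\mathcal{R}} \sum_{m\in\mathbb{Z}} \|\varphi_{R,m}\|_p^\tau \le c \sum_{R\in\mathcal{R}} \|\varphi_R\|_p^\tau,$$

where we switched the order of summation. From the above estimates for $\Sigma_1$ and $\Sigma_2$, we get

$$\|\varphi\|_{B_\tau^{\alpha k}(\mathcal{P})}^\tau \le c \sum_{R\in\mathcal{R}} \|\varphi_R\|_p^\tau \le c\Big(\sum_{R\in\mathcal{R}} \|\varphi_R\|_p^p\Big)^{\tau/p}(\#\mathcal{R})^{1-\tau/p} \le c\,n^{\alpha\tau}\|\varphi\|_p^\tau,$$

where we used Hölder's inequality, the identity $1 - \tau/p = \alpha\tau$, and that $I_0$ is a disjoint union of all $R \in \mathcal{R}$. □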

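We define the approximation space $A_q^\gamma := A_q^\gamma(L_p, \mathcal{P})$ as the set of all functions $f \in
L_p(\mathcal{P}, k)$ such that

$$\|f\|_{A_q^\gamma} := \|f\|_p + \Big(\sum_{n=1}^{\infty} \big(n^\gamma \sigma_n(f, \mathcal{P})_p\big)^q \frac{1}{n}\Big)^{1/q} < \infty \qquad (3.6)$$

with the $\ell_q$-norm replaced by the sup-norm if $q = \infty$, as usual.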
We now recall some basic definitions from the real interpolation method. We refer the
reader to [1] as a general reference for interpolation theory. Suppose $X$ and $B$ are two
quasi-normed spaces and $B \subset X$. The $K$-functional is defined by

$$K(f, t) := K(f, t; X, B) := \inf_{g\in B} \big(\|f - g\|_X + t\|g\|_B\big), \quad t > 0.$$

The real interpolation space $(X, B)_{\lambda,q}$ with $0 < \lambda < 1$ and $0 < q \le \infty$ is defined as the set
of all $f \in X$ such that

$$\|f\|_{(X,B)_{\lambda,q}} := \Big(\int_0^\infty \big(t^{-\lambda}K(f, t)\big)^q \frac{dt}{t}\Big)^{1/q} < \infty,$$

where the $L_q$-norm is replaced by the sup-norm if $q = \infty$.

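The Jackson and Bernstein inequalities from Theorem 3.1 and Theorem 3.2 yield (see
[6], [20]) the following characterization of the approximation spaces $A_q^\gamma$:

Theorem 3.3. We have, for $0 < \gamma < \alpha$ and $0 < q \le \infty$,

$$A_q^\gamma(L_p, \mathcal{P}) = \big(L_p(\mathcal{P}, k), B_\tau^{\alpha k}(\mathcal{P})\big)_{\gamma/\alpha,\,q}$$

with equivalent norms.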

We  next  show  that  in  one  specific  case  the  interpolation  space  as  well  as  the  corresponding 
approximation  space  can  be  identified  as  a  B-space.  The  analogue  of  this  result  for  Besov 
spaces  is  well  known  (see  [8]). 

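Theorem 3.4. Suppose $\mathcal{P}$ is a partition of $\mathbb{R}^d$, $k \ge 1$, $1 < p < \infty$, and $1/\tau := \alpha + 1/p$. Let
$0 < \alpha < \beta$ and $1/\lambda := \beta + 1/p$. We have

$$\big(L_p(\mathcal{P}, k), B_\lambda^{\beta k}(\mathcal{P})\big)_{\alpha/\beta,\,\tau} = B_\tau^{\alpha k}(\mathcal{P}) = A_\tau^\alpha(L_p, \mathcal{P})$$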

with  equivalent  norms. 

This  theorem  can  be  proved  by  using  the  machinery  of  interpolation  spaces  (see  [8]).  Here 
we  take  another  route  by  employing  the  approximation  from  piecewise  polynomials  directly. 
This  approach  will  enable  us  to  reveal  more  deeply  the  intricacies  of  nonlinear  piecewise 
polynomial  approximation.  In  order  to  streamline  the  presentation  of  our  results,  we  give 
the  proof  of  this  theorem  in  the  Appendix. 

•  Approximation scheme for nonlinear piecewise polynomial approximation. We
assume that $f \in L_p(\mathbb{R}^d)$, $0 < p < \infty$, and $\mathcal{P}$ is an arbitrary dyadic partition of $\mathbb{R}^d$. The proof
of Theorem 3.1 suggests the following approximation procedure:

Step 1. Use the local polynomial approximation to represent $f$ as follows:

$$f = \sum_{m\in\mathbb{Z}} t_{m,\eta}(f, \mathcal{P}) = \sum_{I\in\mathcal{P}} t_{I,\eta}(f),$$

where $t_{I,\eta}(f) := \mathbf{1}_I \cdot t_{m,\eta}(f, \mathcal{P})$ if $I \in \mathcal{P}_m$ and $\eta < p$ (see Theorem 3.1).

Step 2. Order $\{\|t_{I,\eta}(f)\|_p\}_{I\in\mathcal{P}}$ in a nonincreasing sequence $\|t_{I_1,\eta}(f)\|_p \ge \|t_{I_2,\eta}(f)\|_p \ge \cdots$
and then define the algorithm by

$$A_n(f, \mathcal{P})_p := \sum_{j=1}^{n} t_{I_j,\eta}(f).$$

By Theorem 3.1 and its proof, it follows that

$$\|f - A_n(f, \mathcal{P})_p\|_p \le c\,n^{-\alpha}\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} \quad \text{for } f \in B_\tau^{\alpha k}(\mathcal{P}).$$

Using this result, one can show that $A_n(f, \mathcal{P})_p$ achieves the rate of the best n-term piecewise
polynomial approximation generated by $\mathcal{P}$.

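•  Nonlinear approximation from the library. We denote

$$\sigma_n(f)_p := \inf_{\mathcal{P}} \sigma_n(f, \mathcal{P})_p,$$

where the infimum is taken over all dyadic partitions $\mathcal{P}$. The following theorem is immediate
from the Jackson estimate in Theorem 3.1:

Theorem 3.5. If $\inf_{\mathcal{P}} \|f\|_{B_\tau^{\alpha k}(\mathcal{P})} < \infty$, then

$$\sigma_n(f)_p \le c\,n^{-\alpha} \inf_{\mathcal{P}} \|f\|_{B_\tau^{\alpha k}(\mathcal{P})}$$

with $c = c(\alpha, k, p, d)$.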

In §5, we shall show that, in a natural discrete setting, there exists an effective algorithm
for finding a partition $\mathcal{P}^*$ which minimizes $\|f\|_{B_\tau^{\alpha k}(\mathcal{P})}$ over all dyadic partitions $\mathcal{P}$.

It is an open problem to characterize the approximation spaces generated by $\{\sigma_n(f)_p\}$.

•  Remarks. There exists another technique that can be employed for the proof of Theorem 3.1.
This method is called "splitting and merging"; it was introduced in [4] and
used for nonlinear approximation of functions from the space $BV(\mathbb{R}^2)$. It was further used
in [11]. Also, the modulus $W(f, t)_{\alpha,p}$ used in [11], which is a generalization of a characteristic
from [16] ($d = 1$), can be generalized and utilized for anisotropic partitions $\mathcal{P}$.

4  Relation between n-term rational and piecewise polynomial approximation

•  n-term rational functions. We denote by $\mathcal{R}_n$ the set of all n-term rational functions
on $\mathbb{R}^d$ of the form

$$R = \sum_{j=1}^{n} r_j,$$

where each $r_j$ is of the form

$$r(x) = \prod_{k=1}^{d} \frac{a_k x_k + b_k}{(x_k - \alpha_k)^2 + \beta_k^2}, \quad a_k, b_k, \alpha_k, \beta_k \in \mathbb{R},\ \beta_k \ne 0,\ x := (x_1, \dots, x_d) \in \mathbb{R}^d. \qquad (4.1)$$
Evidently, every $R \in \mathcal{R}_n$ depends on $\le 4dn$ parameters and $\mathcal{R}_n$ is nonlinear. We denote by
$R_n(f)_p$ the error of $L_p$-approximation to $f$ from $\mathcal{R}_n$:

$$R_n(f)_p := \inf_{R\in\mathcal{R}_n} \|f - R\|_p.$$
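
To make the form (4.1) concrete, here is a small sketch that evaluates an n-term rational function at a point. The function names `atom` and `n_term_rational` and the layout of the parameters (one tuple `(a, b, alpha, beta)` per coordinate) are assumptions made for this illustration.

```python
from math import prod

def atom(x, params):
    """Evaluate one term r(x) of the form (4.1).
    x: point in R^d as a sequence; params: d tuples (a, b, alpha, beta) with beta != 0."""
    return prod((a * xk + b) / ((xk - alpha) ** 2 + beta ** 2)
                for xk, (a, b, alpha, beta) in zip(x, params))

def n_term_rational(x, terms):
    """Evaluate R = r_1 + ... + r_n; terms is a list of n parameter lists (4dn numbers in total)."""
    return sum(atom(x, params) for params in terms)

# A single term in d = 2: (x_1 + 1) / (x_1^2 + 1) * 2 / ((x_2 - 1)^2 + 4)
term = [(1.0, 1.0, 0.0, 1.0), (0.0, 2.0, 1.0, 2.0)]
print(n_term_rational((0.0, 1.0), [term]))
```

Each term carries $4d$ real parameters, so a function with $n$ terms is described by at most $4dn$ numbers, in agreement with the count stated above.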

Our first goal is to show that the rate of n-term rational approximation in $L_p$ ($0 < p < \infty$)
is not worse than that of nonlinear n-term approximation from piecewise polynomials over
nested box partitions of $\mathbb{R}^d$.

•  Piecewise polynomials over almost nested families of boxes. We denote by $\mathcal{J}$
the set of all semi-open boxes $I$ in $\mathbb{R}^d$ (not necessarily dyadic) with sides parallel to the
coordinate axes ($I = I_1 \times \cdots \times I_d$).

Suppose $\Xi_n \subset \mathcal{J}$, $n = 0, 1, \dots$, is a sequence of sets of boxes which satisfy the following:

(i) $\#\Xi_n \le 2^n$.

(ii) For each $n \ge 1$ there exists a set $\Omega_n$ consisting of disjoint boxes from $\mathcal{J}$ such that

(a) $\bigcup\{I : I \in \Omega_n\} = \bigcup\{I : I \in \Xi_n \cup \Xi_{n-1}\}$,

(b) for each $I \in \Omega_n$ and $J \in \Xi_n \cup \Xi_{n-1}$ either $I \subset J$ or $I \cap J = \emptyset$, and

(c) $\#\Omega_n \le c_1 2^n$.

Thus $\Omega_n$ is a set of "small" disjoint boxes which cover the boxes from $\Xi_n \cup \Xi_{n-1}$. Now,
we denote by $S^k(\Xi_n)$ the set of all piecewise polynomials of degree $< k$ on the boxes from
$\Xi_n$, i.e., $\varphi \in S^k(\Xi_n)$ if $\varphi = \sum_{I\in\Xi_n} \mathbf{1}_I \cdot P_I$, $P_I \in \Pi_k$. We denote by $\mathbb{S}^k_{2^n}(f)_p$ the error of $L_p$-approximation
to $f \in L_p(\mathbb{R}^d)$ from $S^k(\Xi_n)$, i.e.,

$$\mathbb{S}^k_{2^n}(f)_p := \mathbb{S}^k_{2^n}(f, \Xi)_p := \inf_{\varphi\in S^k(\Xi_n)} \|f - \varphi\|_p.$$

•  Main results. Our primary goal in this section is to prove the following theorem, which
relates the n-term rational approximation to the piecewise polynomial approximation described above:

Theorem 4.1. Let $f \in L_p(\mathbb{R}^d)$, $0 < p < \infty$, $\alpha > 0$, and $k \ge 1$. Then

$$R_{2^n}(f)_p \le c\,2^{-\alpha n}\Big(\sum_{\nu=0}^{n} \big[2^{\alpha\nu}\,\mathbb{S}^k_{2^\nu}(f)_p\big]^{\mu} + \|f\|_p^{\mu}\Big)^{1/\mu}, \quad \mu := \min\{p, 1\}, \qquad (4.2)$$

with $c = c(p, k, \alpha, d, c_1)$, where $c_1$ is from the properties of $\{\Xi_n\}$.

We now apply the result from Theorem 4.1 to the more particular situation of nonlinear
n-term piecewise polynomial approximation associated with any dyadic partition $\mathcal{P}$, developed
in §3.

Theorem 4.2. Suppose $f \in L_p(\mathbb{R}^d)$, $0 < p < \infty$, $\alpha > 0$, $k \ge 1$, and $\mathcal{P}$ is any anisotropic
dyadic partition of $\mathbb{R}^d$. Then

$$R_n(f)_p \le c\,n^{-\alpha}\Big(\sum_{m=1}^{n} \frac{1}{m}\big[m^{\alpha}\sigma_m(f, \mathcal{P})_p\big]^{\mu} + \|f\|_p^{\mu}\Big)^{1/\mu}, \quad \mu := \min\{p, 1\}, \qquad (4.3)$$

where $c = c(p, k, \alpha, d)$.
Corollary 4.3. Suppose $\inf_{\mathcal{P}} \|f\|_{B_\tau^{\alpha k}(\mathcal{P})} < \infty$ with $\alpha > 0$, $k \ge 1$, and $1/\tau := \alpha + 1/p$,
$0 < p < \infty$, where the infimum is taken over all dyadic partitions $\mathcal{P}$ of $\mathbb{R}^d$. Then

$$R_n(f)_p \le c\,n^{-\alpha} \inf_{\mathcal{P}} \|f\|_{B_\tau^{\alpha k}(\mathcal{P})},$$

where $c = c(\alpha, p, k, d)$.

•  Proof of the main results. For the proof of Theorem 4.1, we shall utilize some ideas
from [15] and [18]. We let $S_n^k(\mathcal{J})$ denote the set of all piecewise polynomials of degree $< k$
on $n$ disjoint boxes in $\mathbb{R}^d$, i.e., $\varphi \in S_n^k(\mathcal{J})$ if $\varphi = \sum_{I\in\Lambda_n} \mathbf{1}_I \cdot P_I$, where $\Lambda_n$ is any collection
of $n$ disjoint boxes from $\mathcal{J}$ and $P_I \in \Pi_k$. The approximation will take place in $L_p(\mathbb{R}^d)$,
$0 < p < \infty$.

Theorem 4.4. For each $\varphi \in S_m^k(\mathcal{J})$, $m \ge 1$, and $n \ge 1$, there exists $R \in \mathcal{R}_n$ such that

$$\|\varphi - R\|_p \le c \exp\big(-c_2 (n/m)^{1/(2d)}\big)\|\varphi\|_p, \qquad (4.4)$$

where $c_2 = c_2(p, d, k, c_1) > 0$.

D. Newman [13] proved the remarkable result that the uniform nth degree rational
approximation of $|x|$ on $[-1, 1]$ is of order $O(e^{-c\sqrt{n}})$. The following lemma rests on Newman's
construction.


Lemma 4.5. For each $\gamma > 0$, $0 < \delta < 1$, and $\nu$ a positive integer, there exists a univariate
rational function $\sigma$ such that $\deg \sigma \le c \ln(e + 1/\delta)\ln(e + 1/\gamma) + 4\nu$ and

$$0 \le 1 - \sigma(t) \le \gamma, \quad \text{if } |t| \le 1 - \delta,$$

$$0 \le \sigma(t) \le \gamma\,|t|^{-4\nu}, \quad \text{if } |t| \ge 1,$$

$$0 \le \sigma(t) \le 1, \quad t \in (-\infty, \infty),$$

where $c$ is an absolute constant. Moreover, $\sigma$ has only simple poles and, evidently, if $\sigma =
P/Q$, then $\deg P < \deg Q$.

Proof. It follows from Lemma 8.3 of [20] (see also [18]) that there exists a rational function
$\sigma$ which satisfies all the conditions of Lemma 4.5 except possibly the last one (simple
poles). Evidently, adding a suitable sufficiently small constant to the denominator of $\sigma$ in its
representation as a quotient of two polynomials will ensure the last condition of the lemma
without destroying the other conditions. □
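
Newman's construction, on which Lemma 4.5 rests, is easy to try out numerically. The sketch below uses the classical form of Newman's approximant to $|x|$, namely $r(x) = x\,\frac{p(x) - p(-x)}{p(x) + p(-x)}$ with $p(x) = \prod_{k=0}^{n-1}(x + \xi^k)$ and $\xi = e^{-1/\sqrt{n}}$, together with the classical bound $3e^{-\sqrt{n}}$ for $n \ge 5$. It illustrates Newman's result only; it is not the function $\sigma$ of Lemma 4.5, and the function name `newman` is chosen for this example.

```python
import math

def newman(x, n):
    """Newman's rational approximant to |x| on [-1, 1], built from a polynomial of degree n."""
    xi = math.exp(-1.0 / math.sqrt(n))
    p_plus = math.prod(x + xi ** k for k in range(n))     # p(x)
    p_minus = math.prod(-x + xi ** k for k in range(n))   # p(-x)
    return x * (p_plus - p_minus) / (p_plus + p_minus)

n = 16
grid = [k / 1000.0 - 1.0 for k in range(2001)]            # uniform grid on [-1, 1]
err = max(abs(abs(x) - newman(x, n)) for x in grid)
print(err, 3 * math.exp(-math.sqrt(n)))                   # observed error vs. Newman's bound
```

The root-exponential decay of the error is what produces the factor $\exp(-c_2(n/m)^{1/(2d)})$ in Theorem 4.4 once $d$ univariate factors are multiplied together.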

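For the proof of Theorem 4.4, we shall use the Fefferman-Stein vector valued maximal
inequality (see [10] or [21]): If $0 < p < \infty$, $0 < q \le \infty$, and $0 < s < \min\{p, q\}$, then for any
sequence of functions $f_1, f_2, \dots$ on $\mathbb{R}^d$

$$\Big\|\Big(\sum_{j=1}^{\infty} \big[(M_s f_j)(\cdot)\big]^q\Big)^{1/q}\Big\|_p \le c\,\Big\|\Big(\sum_{j=1}^{\infty} |f_j(\cdot)|^q\Big)^{1/q}\Big\|_p, \qquad (4.5)$$

where $c = c(p, q, s, d)$ and

$$(M_s f)(x) := \sup_{I\in\mathcal{J}:\ x\in I} \Big(\frac{1}{|I|}\int_I |f(y)|^s\,dy\Big)^{1/s}, \quad x \in \mathbb{R}^d.$$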
Lemma 4.6. Suppose $\varphi := \mathbf{1}_I \cdot P$ with $I \in \mathcal{J}$ and $P \in \Pi_k$, and let $\lambda, s > 0$. Then there
exists a rational function $R \in \mathcal{R}_\ell$ with $\ell \le c \ln^{2d}(e + 1/\lambda)$ such that

$$\|\varphi - R\|_{L_p(I)} \le c\lambda\|\varphi\|_p$$

and

$$|R(x)| \le c\lambda\,|I|^{-1/p}\|\varphi\|_p\,(M_s \mathbf{1}_I)(x), \quad x \in \mathbb{R}^d \setminus I,$$

where $c = c(k, p, s, d)$.

Proof. It is easily seen that

$$(M_s \mathbf{1}_I)(x) = \prod_{i=1}^{d} (M_s \mathbf{1}_{I_i})(x_i), \quad I = I_1 \times \cdots \times I_d \qquad (4.6)$$

(product of univariate maximal functions).

We shall prove the lemma in the case when $I = Q := [-1, 1)^d$. The general case follows
by change of variables. Let $0 < \lambda < 1$ (the case $\lambda \ge 1$ is obvious). Since $P \in \Pi_k$, all
norms of $P$ are equivalent and this yields

$$|P(x)| \le c\|\varphi\|_p \prod_{i=1}^{d} (1 + |x_i|)^k, \quad x \in \mathbb{R}^d \setminus \tfrac{1}{2}Q, \qquad (4.7)$$

where $c = c(p, k, d)$ and $\tfrac{1}{2}Q := [-\tfrac{1}{2}, \tfrac{1}{2})^d$.

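Let $\sigma$ be the univariate rational function from Lemma 4.5, applied with $\gamma := \lambda$, $\delta :=
\min\{\lambda^p, 1/2\}$, and $\nu := [\tfrac{1}{4}(k + 1/s)] + 1$. We define $R := \kappa P$ with $\kappa(x) := \prod_{i=1}^{d} \sigma(x_i)$.
By Lemma 4.5,

$$\deg \sigma \le c \ln(e + 1/\lambda^p)\ln(e + 1/\lambda) + 4\nu \le c \ln^2(e + 1/\lambda), \quad c = c(k, p, s),$$

and $\sigma$ has only simple poles. Therefore, $R \in \mathcal{R}_\ell$ with $\ell \le c \ln^{2d}(e + 1/\lambda)$. Obviously
$0 \le \kappa(x) \le 1$, $x \in \mathbb{R}^d$. It is readily seen that

$$0 \le 1 - \kappa(x) \le \sum_{i=1}^{d} \big(1 - \sigma(x_i)\big) \le d\lambda \quad \text{for } x \in Q_\delta := [-1 + \delta, 1 - \delta]^d.$$

Therefore,

$$\|\varphi - R\|_{L_p(Q_\delta)} = \|P(1 - \kappa)\|_{L_p(Q_\delta)} \le c\lambda\|\varphi\|_p$$

and, using (4.7),

$$\|\varphi - R\|_{L_p(Q\setminus Q_\delta)} \le c\|\varphi\|_p\,|Q \setminus Q_\delta|^{1/p} \le c\,\delta^{1/p}\|\varphi\|_p \le c\lambda\|\varphi\|_p.$$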

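Finally, by (4.6) and (4.7), we find, for $x \in \mathbb{R}^d \setminus Q$,

$$|R(x)| \le c\lambda\|\varphi\|_p \prod_{i=1}^{d} \Big(\frac{1}{1 + |x_i|}\Big)^{4\nu-k} \le c\lambda\|\varphi\|_p \prod_{i=1}^{d} (M_s \mathbf{1}_{[-1,1)})(x_i) = c\lambda\|\varphi\|_p\,(M_s \mathbf{1}_Q)(x),$$

where we used that $4\nu - k \ge 1/s$ and hence

$$(M_s \mathbf{1}_{[-1,1)})(t) \ge \Big(\frac{2}{2 + |t|}\Big)^{1/s} \ge \Big(\frac{1}{1 + |t|}\Big)^{4\nu-k}, \quad |t| \ge 1. \qquad \Box$$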


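Proof of Theorem 4.4. Suppose $\varphi \in S_m^k(\mathcal{J})$ ($m \le n$) and $\varphi =: \sum_{I\in\Lambda_m} \mathbf{1}_I \cdot P_I$, $\Lambda_m \subset \mathcal{J}$.
Let $\lambda := \exp\big(-(n/m)^{1/(2d)}\big)$ and $s := \tfrac{1}{2}\min\{p, 1\}$. We apply Lemma 4.6 to each function
$\varphi_I := \mathbf{1}_I \cdot P_I$ to conclude that there exist rational functions $R_I \in \mathcal{R}_\ell$ with $\ell \le c \ln^{2d}(e + 1/\lambda)$
such that

$$\|\varphi_I - R_I\|_{L_p(I)} \le c\lambda\|\varphi_I\|_p$$

and

$$|R_I(x)| \le c\lambda\|\varphi_I\|_p\,|I|^{-1/p}(M_s \mathbf{1}_I)(x), \quad x \in \mathbb{R}^d \setminus I.$$

We define $R := \sum_{I\in\Lambda_m} R_I$. Obviously, $R \in \mathcal{R}_{m\ell} \subset \mathcal{R}_{cn}$. We have

$$\|\varphi - R\|_p \le c\Big(\sum_{I} \|\varphi_I - R_I\|_{L_p(I)}^p\Big)^{1/p} + c\lambda\Big\|\sum_{I} \|\varphi_I\|_p\,|I|^{-1/p}(M_s \mathbf{1}_I)(\cdot)\Big\|_p$$

$$\le c\lambda\Big(\sum_{I} \|\varphi_I\|_p^p\Big)^{1/p} + c\lambda\Big\|\sum_{I} \|\varphi_I\|_p\,|I|^{-1/p}\,\mathbf{1}_I(\cdot)\Big\|_p \le c\lambda\|\varphi\|_p,$$

where we used (4.5) with $q = 1$ and $s = \tfrac{1}{2}\min\{p, 1\} < \min\{p, 1\}$. Theorem 4.4 follows. □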

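Proof of Theorem 4.1. Case I: $p \ge 1$. Evidently, there exists $\varphi_\nu \in S^k(\Xi_\nu)$ such that
$\|f - \varphi_\nu\|_p = \mathbb{S}^k_{2^\nu}(f)_p$. We define $\psi_\nu := \varphi_\nu - \varphi_{\nu-1}$, $\nu \ge 1$, and $\psi_0 := \varphi_0$. Then we have, for
$\nu \ge 1$,

$$\|\psi_\nu\|_p \le \|f - \varphi_\nu\|_p + \|f - \varphi_{\nu-1}\|_p = \mathbb{S}^k_{2^\nu}(f)_p + \mathbb{S}^k_{2^{\nu-1}}(f)_p \quad \text{and} \quad \|\psi_0\|_p \le \mathbb{S}^k_1(f)_p + \|f\|_p.$$

From the properties of $\{\Xi_\nu\}$, there exists a set of disjoint boxes $\Omega_\nu \subset \mathcal{J}$ such that $m_\nu :=
\#\Omega_\nu \le c_1 2^\nu$ and $\psi_\nu \in S^k(\Omega_\nu)$.

We fix $j \ge 0$. Now, for each $\nu = 0, 1, \dots, j$, we apply Theorem 4.4 with $\varphi := \psi_\nu$, $m := m_\nu$
(from above), and $n := N_\nu := [\lambda 2^\nu(\alpha(j - \nu))^{2d}] + 1$, where $\lambda := c_1(\ln 2/c_2)^{2d}$ and $c_2$ is from
Theorem 4.4. We obtain that there exist $R_\nu \in \mathcal{R}_{N_\nu}$ such that, for $\nu \ge 1$,

$$\|\psi_\nu - R_\nu\|_p \le c \exp\Big(-c_2\Big(\frac{N_\nu}{m_\nu}\Big)^{1/(2d)}\Big)\|\psi_\nu\|_p \le c\,2^{-\alpha(j-\nu)}\big(\mathbb{S}^k_{2^\nu}(f)_p + \mathbb{S}^k_{2^{\nu-1}}(f)_p\big) \qquad (4.8)$$

and

$$\|\psi_0 - R_0\|_p \le c\,2^{-\alpha j}\|\psi_0\|_p \le c\,2^{-\alpha j}\big(\mathbb{S}^k_1(f)_p + \|f\|_p\big). \qquad (4.9)$$

We define $R := \sum_{\nu=0}^{j} R_\nu$. Obviously, $R \in \mathcal{R}_N$ with

$$N = \sum_{\nu=0}^{j} N_\nu \le \sum_{\nu=0}^{j} \big(\lambda\alpha^{2d} 2^\nu (j - \nu)^{2d} + 1\big) \le c_3 2^j, \quad c_3 = c_3(p, k, d, \alpha, c_1).$$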
From (4.8) and (4.9), we find

$$\|f - R\|_p \le \|f - \varphi_j\|_p + \sum_{\nu=0}^{j} \|\psi_\nu - R_\nu\|_p \le c\,2^{-\alpha j}\Big(\sum_{\nu=0}^{j} 2^{\alpha\nu}\,\mathbb{S}^k_{2^\nu}(f)_p + \|f\|_p\Big).$$

Estimate (4.2) follows from the above by a suitable selection of $j$ (depending on $n$).

Case II: $0 < p < 1$. The proof is similar to the one from Case I. The only difference is
that, in this case, one should use the $p$-triangle inequality ($\|\sum_j g_j\|_p^p \le \sum_j \|g_j\|_p^p$, $0 < p < 1$)
instead of Minkowski's inequality. □

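Proof of Theorem 4.2. We may assume that $\varphi_\nu \in \Sigma^k_{2^\nu}(\mathcal{P})$ are such that $\|f - \varphi_\nu\|_p =
\sigma_{2^\nu}(f, \mathcal{P})_p$, $\nu = 0, 1, \dots$. Suppose $\varphi_\nu =: \sum_{I\in\Lambda_\nu} \mathbf{1}_I \cdot P_I$, where $P_I \in \Pi_k$, $\Lambda_\nu \subset \mathcal{P}$, and
$\#\Lambda_\nu \le 2^\nu$. From the proofs of Theorem 3.2 and Theorem 3.4, it follows that the sequence
$\{\Lambda_\nu\}$ satisfies conditions (i)-(ii) of $\{\Xi_\nu\}$ and, therefore, (4.2) holds with $\mathbb{S}^k_{2^\nu}(f)_p$ replaced
by $\sigma_{2^\nu}(f, \mathcal{P})_p$, which implies (4.3). □

Proof of Corollary 4.3. This corollary follows immediately from Theorem 3.1 and Theorem 4.2. □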

•  Sharpness of the results. It is rather easy to see that the estimates of this section are
sharp with respect to the rate of approximation. For a given $n \ge 1$, consider the function

$$f_n(x) := \Big(\prod_{\nu=1}^{d} \sin \pi x_\nu\Big) \cdot \mathbf{1}_{[0,4n]\times[0,1]^{d-1}}(x), \quad x := (x_1, \dots, x_d) \in \mathbb{R}^d.$$

Since $\sin \pi x_1$ oscillates $4n$ times on $[0, 4n]$ and every n-term rational function can oscillate
$\le 2n$ times on any straight line parallel to the $x_1$-axis (it has no more than $2n - 1$ zeros),
we have $R_n(f_n)_p \ge c\|f_n\|_p \ge c\,n^{1/p}$, $0 < p < \infty$. On the other hand, evidently, if $\alpha > 0$ and
$1/\tau = \alpha + 1/p$, then $\|f_n\|_{B_\tau^{d\alpha,k}(L_\tau)} \le c\,n^{1/\tau}$, where $B_\tau^{d\alpha,k}(L_\tau)$ is the Besov space defined in
(2.26). Therefore,

$$\sup_{\|f\|_{B_\tau^{d\alpha,k}(L_\tau)} \le 1} R_n(f)_p \ge c\,n^{-\alpha}$$

and hence the estimate from Corollary 4.3 is sharp, and similarly for the other estimates.

5  Nonlinear n-term approximation from the library of anisotropic Haar bases and best basis selection

An anisotropic Haar basis is naturally associated with each anisotropic dyadic partition $\mathcal{P}$ of
a box $\Omega$ in $\mathbb{R}^d$ (or of $\mathbb{R}^d$). For the sake of simplicity, we shall consider Haar bases only on a box $\Omega$
with sides parallel to the coordinate axes and $|\Omega| = 1$. Then $\mathcal{P} = \bigcup_{m=0}^{\infty} \mathcal{P}_m$. Let $I \in \mathcal{P}$
and $I =: I_1 \times \cdots \times I_d$. Suppose $I$ is split (in $\mathcal{P}$) by dividing in half the $\nu$th ($1 \le \nu \le d$)
side of $I$. Then we define $H_I := \mathbf{1}_{I_1} \times \cdots \times H_{I_\nu} \times \cdots \times \mathbf{1}_{I_d}$, where $H_{I_\nu}$ is the univariate
Haar function supported on $I_\nu$ and normalized in $L_\infty$. In other words, if $I \in \mathcal{P}$ and $J_1, J_2$
are the two children of $I$ in $\mathcal{P}$ (properly ordered), then $H_I := \mathbf{1}_{J_1} - \mathbf{1}_{J_2}$. We need to add the
characteristic function of $\Omega$ to the collection of the above defined Haar functions. To this
end we denote $I^0 := I_0 := \Omega$ and include both $I^0$ and $I_0$ in $\mathcal{P}_0$ and $\mathcal{P}$. So, there are two
copies of $\Omega$ in $\mathcal{P}$. We define $H_{I^0} := \mathbf{1}_{I^0}$ and $\mathcal{P}^\circ := \mathcal{P} \setminus \{I^0\}$.
Thus $\mathcal{H}_{\mathcal{P}} := \{H_I : I \in \mathcal{P}\}$ is the Haar basis associated with $\mathcal{P}$. We let $\mathbb{H} := \{\mathcal{H}_{\mathcal{P}}\}_{\mathcal{P}}$
denote the collection (library) of all anisotropic Haar bases on $\Omega$.

Clearly, the following is valid for a fixed partition $\mathcal{P}$: (i) $\mathcal{H}_{\mathcal{P}}$ is an orthogonal system in
$L_2(\Omega)$ and it is an orthogonal basis for $L_2(\mathcal{P}) := L_2(\mathcal{P}, 1)$. (ii) The linear space $S_n^1$ of all
piecewise constants over the boxes from $\mathcal{P}_n$ (see §2) is spanned by $\{H_I : I \in \bigcup_{\nu=0}^{n} \mathcal{P}_\nu\}$.

Other anisotropic Haar bases which involve products of Haar functions can easily be
constructed, too. We do not consider such constructions in this article since they do not
change the essence of the problems.

•  $\mathcal{H}_{\mathcal{P}}$ is a basis for $L_p(\mathcal{P})$ and $B_\tau^{\alpha 1}(\mathcal{P})$.

Theorem 5.1. For each dyadic partition $\mathcal{P}$ of $\Omega$ the Haar basis $\mathcal{H}_{\mathcal{P}}$ is an unconditional
basis for $L_p(\mathcal{P})$, $1 < p < \infty$.

Proof. The proof can be carried out exactly as the proof in the case of the univariate Haar
system due to Burkholder (see [24]) and we shall skip it. □

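Throughout the rest of this section, we shall assume that $1 < p < \infty$, $\alpha > 0$, $1/\tau :=
\alpha + 1/p$, and $\mathcal{P}$ is an arbitrary dyadic partition of $\Omega$. We naturally have (see (2.20)-(2.21))

$$\|f\|_{B_\tau^{\alpha 1}(\mathcal{P})} := \|f\|_{L_p(\Omega)} + \Big(\sum_{I\in\mathcal{P}^\circ} |I|^{-\alpha\tau}\omega_1(f, I)_\tau^\tau\Big)^{1/\tau}.$$

We next characterize the B-norm of a function in $B_\tau^{\alpha 1}(\mathcal{P})$ by means of its Haar coefficients
using $\mathcal{H}_{\mathcal{P}}$.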

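Theorem 5.2. Every $f \in B_\tau^{\alpha 1}(\mathcal{P})$ can be represented uniquely in the form

$$f = \sum_{I\in\mathcal{P}} c_I(f) H_I \quad \text{a.e. on } \Omega \quad \text{with} \quad c_I(f) := |I|^{-1}\int_I f H_I, \qquad (5.1)$$

where the series converges absolutely a.e. and unconditionally in $L_p$. Moreover,

$$\|f\|_{B_\tau^{\alpha 1}(\mathcal{P})} \approx \mathcal{N}(f, \mathcal{H}_{\mathcal{P}}) := \Big(\sum_{I\in\mathcal{P}} |I|^{-\alpha\tau}\|c_I(f) H_I\|_\tau^\tau\Big)^{1/\tau} = \Big(\sum_{I\in\mathcal{P}} |I|^{-\alpha\tau+1}|c_I(f)|^\tau\Big)^{1/\tau} = \Big(\sum_{I\in\mathcal{P}} \|c_I(f) H_I\|_p^\tau\Big)^{1/\tau} \qquad (5.2)$$

with constants of equivalence depending only on $p$, $\alpha$, and $d$.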

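Proof. Let $f \in B_\tau^\alpha$, $B_\tau^\alpha := B_\tau^{\alpha 1}(\mathcal{P})$. By Theorems 2.7-2.8, $f \in L_p(\Omega)$ and hence, using
Theorem 5.1, $f$ has a unique representation in the form (5.1). We shall next prove that

$$\mathcal{N}(f, \mathcal{H}_{\mathcal{P}}) \le c\|f\|_{B_\tau^\alpha}. \qquad (5.3)$$

Case I: $\tau \ge 1$. This case is trivial because we obviously have

$$|c_{I^0}(f)| = \Big|\int_\Omega f\Big| \le \|f\|_p \quad \text{and} \quad |c_I(f)| = |I|^{-1}\Big|\int_I f H_I\Big| \le |I|^{-1/\tau}\omega_1(f, I)_\tau \ \text{ if } I \ne I^0,$$

which, in view of (5.2), imply (5.3).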
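Case II: $0 < \tau < 1$. Clearly,

$$|I^0|^{-\alpha\tau}\|c_{I^0}(f) H_{I^0}\|_\tau^\tau \le \|f\|_{L_1(\Omega)}^\tau \le \|f\|_{L_p(\Omega)}^\tau \quad (|I^0| = 1).$$

By Theorem 2.7 with $\eta = \tau$ and $k = 1$, $f$ can be represented in the form

$$f = T_0 + \sum_{j=1}^{\infty} t_j = T_0 + \sum_{j=1}^{\infty} \sum_{I\in\mathcal{P}_j} t_I \quad \text{a.e. on } \Omega$$

with the series converging absolutely a.e., where $t_j := t_{j,\tau}(f, \mathcal{P}) := T_j - T_{j-1}$, $T_j := T_{j,\tau}(f, \mathcal{P})$,
and $t_I := \mathbf{1}_I \cdot t_j$ if $I \in \mathcal{P}_j$.

Fix $I \in \mathcal{P}_m$ ($m \ge 0$), $I \ne I^0$. Evidently, $\|c_I(f) H_I\|_1 = |c_I(f)||I| \le \|f - c\|_{L_1(I)}$ for every
constant $c$. Therefore,

$$\|c_I(f) H_I\|_1 \le \|f - T_m\|_{L_1(I)} \le \sum_{j=m+1}^{\infty} \|t_j\|_{L_1(I)},$$

which readily implies

$$|I|^{-\alpha\tau}\|c_I(f) H_I\|_\tau^\tau = |I|^{-\alpha\tau+1-\tau}\|c_I(f) H_I\|_1^\tau \le |I|^{-\gamma\tau}\Big(\sum_{j=m+1}^{\infty} \|t_j\|_{L_1(I)}\Big)^\tau$$

$$\le |I|^{-\gamma\tau} \sum_{j=m+1}^{\infty} \sum_{J\in\mathcal{P}_j,\, J\subset I} \|t_J\|_1^\tau \le \sum_{j=m+1}^{\infty} \sum_{J\in\mathcal{P}_j,\, J\subset I} (|J|/|I|)^{\gamma\tau}\|t_J\|_p^\tau,$$

with $\gamma := \alpha - 1/\tau + 1 = 1 - 1/p > 0$, where we used that $\tau < 1$. We now proceed similarly
as in the proof of Theorem 2.6 (see the Appendix). We substitute the above estimates in
the definition of $\mathcal{N}(f, \mathcal{H}_{\mathcal{P}})$ in (5.2) and switch the order of summation to obtain (5.3).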

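In the other direction, the Haar basis $\mathcal{H}_{\mathcal{P}}$ obviously satisfies the conditions of Theorem 2.5
and hence

$$\Big\|\sum_{I\in\mathcal{P}} c_I(f) H_I\Big\|_p \le c\,\mathcal{N}(f, \mathcal{H}_{\mathcal{P}}). \qquad (5.4)$$

On the other hand, by Theorem 5.1, $\mathcal{H}_{\mathcal{P}}$ is an unconditional basis for $L_p(\mathcal{P})$. Therefore,

$$f = \sum_{I\in\mathcal{P}} c_I(f) H_I \quad \text{a.e. on } \Omega$$

with the series converging absolutely a.e. and unconditionally in $L_p$. Using (5.4), we infer
$\|f\|_p \le c\,\mathcal{N}(f, \mathcal{H}_{\mathcal{P}})$. We utilize the above representation of $f$ to obtain

$$S_m^1(f)_\tau \le \Big\|f - \sum_{|I| > 2^{-m}} c_I(f) H_I\Big\|_\tau \le \Big(\sum_{j=m}^{\infty} \Big\|\sum_{I\in\mathcal{P}_j} c_I(f) H_I\Big\|_\tau^\lambda\Big)^{1/\lambda} = \Big(\sum_{j=m}^{\infty} \Big(\sum_{I\in\mathcal{P}_j} \|c_I(f) H_I\|_\tau^\tau\Big)^{\lambda/\tau}\Big)^{1/\lambda}$$

with $\lambda := \min\{\tau, 1\}$. Now, exactly as in the proof of Theorem 2.6 (see the Appendix), we
use this in (2.22) and switch the order of summation to obtain $\|f\|_{B_\tau^\alpha} \le c\,\mathcal{N}(f, \mathcal{H}_{\mathcal{P}})$. This
completes the proof of the theorem. □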
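•  n-term approximation from a single basis $\mathcal{H}_{\mathcal{P}}$. For a given partition $\mathcal{P}$, we denote
by $\Sigma_n(\mathcal{P})$ the set of all functions $\varphi$ of the form

$$\varphi = \sum_{I\in\Lambda_n} a_I H_I,$$

where $\Lambda_n \subset \mathcal{P}$ and $\#\Lambda_n \le n$. The error $\sigma_n(f, \mathcal{H}_{\mathcal{P}})_p$ of nonlinear n-term $L_p$-approximation
to $f$ from $\mathcal{H}_{\mathcal{P}}$ is defined by

$$\sigma_n(f, \mathcal{H}_{\mathcal{P}})_p := \inf_{\varphi\in\Sigma_n(\mathcal{P})} \|f - \varphi\|_{L_p(\Omega)}.$$

Clearly, $\Sigma_n(\mathcal{P}) \subset \Sigma_{2n}^1(\mathcal{P})$ and hence $\sigma_{2n}(f, \mathcal{P})_p \le \sigma_n(f, \mathcal{H}_{\mathcal{P}})_p$. The approximation spaces
$A_q^\gamma := A_q^\gamma(L_p, \mathcal{H}_{\mathcal{P}})$ generated by the n-term approximation from $\mathcal{H}_{\mathcal{P}}$ are defined similarly
to the approximation spaces $A_q^\gamma(L_p, \mathcal{P})$ (see (3.6)). The problem again is to characterize the
approximation spaces $A_q^\gamma$, which reduces to establishing Jackson and Bernstein inequalities and
interpolation.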

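Theorem 5.3. Suppose $\mathcal{P}$ is an arbitrary partition of $\Omega$ and let $1 < p < \infty$, $\alpha > 0$, and
$1/\tau := \alpha + 1/p$. Then the following Jackson and Bernstein inequalities hold:

$$\sigma_n(f, \mathcal{H}_{\mathcal{P}})_p \le c\,n^{-\alpha}\|f\|_{B_\tau^{\alpha 1}(\mathcal{P})}, \quad f \in B_\tau^{\alpha 1}(\mathcal{P}), \qquad (5.5)$$

$$\|\varphi\|_{B_\tau^{\alpha 1}(\mathcal{P})} \le c\,n^{\alpha}\|\varphi\|_{L_p(\Omega)}, \quad \varphi \in \Sigma_n(\mathcal{P}), \quad c = c(\alpha, p, d). \qquad (5.6)$$

Therefore, for $0 < \gamma < \alpha$ and $0 < q \le \infty$,

$$A_q^\gamma(L_p, \mathcal{H}_{\mathcal{P}}) = \big(L_p(\mathcal{P}), B_\tau^{\alpha 1}(\mathcal{P})\big)_{\gamma/\alpha,\,q} = A_q^\gamma(L_p, \mathcal{P}) \qquad (5.7)$$

with equivalent norms (see Theorem 3.3).

Proof. The Jackson estimate (5.5) can be proved, using Theorem 5.2, exactly as Theorem 3.1
was proved. The Bernstein inequality (5.6) follows from Theorem 3.2. An easier proof can be
given by using that $\mathcal{H}_{\mathcal{P}}$ is an unconditional basis for $L_p$ ($1 < p < \infty$). The characterization
of $A_q^\gamma$ in (5.7) follows from (5.5) and (5.6) (see [6], [20]). □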

•  "Algorithm" for n-term approximation from $\mathcal{H}_{\mathcal{P}}$. We note that a near best n-term
$L_p$-approximation from $\mathcal{H}_{\mathcal{P}}$ ($1 < p < \infty$) to a given function $f \in L_p(\mathcal{P})$ can be achieved
by simply retaining the biggest (in $L_p$) $n$ terms from the representation of the function
$f$ in $\mathcal{H}_{\mathcal{P}}$ (see [23]). This result suggests the following "threshold algorithm" for n-term
$L_p$-approximation from $\mathcal{H}_{\mathcal{P}}$ ($1 < p < \infty$):

Step 1. Find the Haar decomposition of $f$ in $\mathcal{H}_{\mathcal{P}}$: $f = \sum_{I\in\mathcal{P}} c_I(f) H_I$.

Step 2. Order the terms of $\{\|c_I(f) H_I\|_p\}_{I\in\mathcal{P}}$ in a nonincreasing sequence $\|c_{I_1}(f) H_{I_1}\|_p \ge
\|c_{I_2}(f) H_{I_2}\|_p \ge \cdots$ and then define the algorithm by

$$A_n(f, \mathcal{P})_p := \sum_{j=1}^{n} c_{I_j}(f) H_{I_j}.$$

From the above observation, $A_n(f, \mathcal{P})_p$ provides a near best n-term $L_p$-approximation to $f$
from piecewise constants generated by $\mathcal{P}$.
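
The two steps of the threshold algorithm can be sketched in the simplest case $d = 1$, $\Omega = [0, 1)$, where the dyadic partition is unique. This is an illustration under that assumption only; the function names `haar_terms`, `term_norm` and `threshold`, and the representation of $f$ by its values on $2^L$ equal cells, are choices made for this example. The normalization follows the text: $H_I = \mathbf{1}_{J_1} - \mathbf{1}_{J_2}$, $c_I(f) = |I|^{-1}\int_I f H_I$, and $\|c_I(f)H_I\|_p = |c_I(f)|\,|I|^{1/p}$.

```python
def haar_terms(values):
    """Step 1: Haar coefficients of a function on [0, 1) given by its values on 2^L equal cells.
    Returns triples (m, i, c) with c the coefficient of H_I for I = [i/2^m, (i+1)/2^m);
    the constant function 1_Omega is encoded with m = None."""
    L = len(values).bit_length() - 1
    assert len(values) == 1 << L
    avg = [float(v) for v in values]
    terms = []
    for m in range(L - 1, -1, -1):                 # from fine to coarse levels
        left, right = avg[0::2], avg[1::2]         # averages over the two children of each I
        terms += [(m, i, (a - b) / 2.0) for i, (a, b) in enumerate(zip(left, right))]
        avg = [(a + b) / 2.0 for a, b in zip(left, right)]
    terms.append((None, 0, avg[0]))                # coefficient of 1_Omega (the mean of f)
    return terms

def term_norm(term, p):
    """L_p norm of c_I H_I, which equals |c_I| |I|^(1/p)."""
    m, _, c = term
    size = 1.0 if m is None else 2.0 ** (-m)
    return abs(c) * size ** (1.0 / p)

def threshold(values, n, p):
    """Step 2: keep the n terms with the largest L_p norms and return the cell values of the sum."""
    kept = sorted(haar_terms(values), key=lambda t: term_norm(t, p), reverse=True)[:n]
    L = len(values).bit_length() - 1
    out = [0.0] * len(values)
    for m, i, c in kept:
        if m is None:
            out = [v + c for v in out]
            continue
        width = 1 << (L - m)                       # number of cells inside I
        start, mid = i * width, i * width + width // 2
        for j in range(start, mid):
            out[j] += c                            # H_I = +1 on the first child
        for j in range(mid, start + width):
            out[j] -= c                            # H_I = -1 on the second child
    return out

print(threshold([4, 2, 5, 5, 1, 1, 1, 3], 3, p=2.0))
```

Keeping all $2^L$ terms reproduces the input exactly, which is a convenient sanity check for the normalization of the coefficients.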

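•  n-term approximation from the library $\mathbb{H} := \{\mathcal{H}_{\mathcal{P}}\}$. We denote by $\sigma_n(f)_p$ the error
of n-term approximation of $f \in L_p$ from the best basis in $\mathbb{H}$, i.e.,

$$\sigma_n(f)_p := \inf_{\mathcal{P}} \sigma_n(f, \mathcal{H}_{\mathcal{P}})_p.$$

The following theorem is immediate from the Jackson estimate (5.5):

Theorem 5.4. If $\inf_{\mathcal{P}} \|f\|_{B_\tau^{\alpha 1}(\mathcal{P})} < \infty$, then

$$\sigma_n(f)_p \le c\,n^{-\alpha} \inf_{\mathcal{P}} \|f\|_{B_\tau^{\alpha 1}(\mathcal{P})}$$

with $c = c(p, \alpha, d)$.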

Our approximation scheme for nonlinear n-term approximation of a given function $f \in
L_p(\Omega)$ from the library $\mathbb{H} := \{\mathcal{H}_{\mathcal{P}}\}$ of all anisotropic Haar bases consists of two steps:

(i) Find a basis $\mathcal{H}(f) \in \mathbb{H}$ which minimizes the $B_\tau^{\alpha 1}$-norm of $f$.

(ii) Run the above threshold algorithm for near best n-term approximation from $\mathcal{H}(f)$.

The most significant fact in this part is that, in a natural discrete setting, there is an effective
algorithm for best Haar basis selection, which we present below.

The above approximation scheme requires a priori information about the smoothness
$\alpha > 0$ of the function $f$ (which is being approximated) with respect to the optimal $B_\tau^{\alpha 1}$-scale.
We do not have an effective solution for this hard problem. Of course, one can get
some idea about the optimal smoothness $\alpha$ of a given function experimentally.

•  Best Haar basis or best B-space selection. We next describe a fast algorithm for
best anisotropic Haar basis or best B-space selection in the discrete case of dimension $d = 2$.
This algorithm is well known (see, e.g., [9] and the references therein). Also, this algorithm
is somewhat related to the algorithm for best basis selection from wavelet packets (see
[3]). Both algorithms rest on one and the same principle.

We consider the set $X_n$ of all functions $f : [0, 1)^2 \to \mathbb{R}$ which are constant on each of the
$2^n \times 2^n$ "pixels"

$$I = [(i - 1)2^{-n}, i2^{-n}) \times [(j - 1)2^{-n}, j2^{-n}), \quad 1 \le i, j \le 2^n.$$

Denote by $\mathcal{D}_n$ the set of all such pixels on $[0, 1)^2$. We let $\mathbf{P}_n$ denote the set of all dyadic
partitions $\mathcal{P}$ of $[0, 1)^2$ such that $\mathcal{P}_{2n} = \mathcal{D}_n$ and we shall consider $\mathcal{P}$ terminated at level $2n$.
Thus $\mathcal{P} = \bigcup_{m=0}^{2n} \mathcal{P}_m$. Clearly, $X_n = S_{2n}^1$ (see §2).

Motivated by the result from Theorem 5.4, our next goal is to find, for a given $f \in X_n$,
a dyadic partition $\mathcal{P}^* := \mathcal{P}^*(f) \in \mathbf{P}_n$ which minimizes the B-norm $\mathcal{N}(f, \mathcal{H}_{\mathcal{P}})$ from (5.2).
Evidently, for $\mathcal{P} \in \mathbf{P}_n$, $\mathcal{H}_{\mathcal{P}}$ is an orthogonal basis for the linear space $X_n$ and, therefore,

$$f = \sum_{I\in\mathcal{P}} c_I(f) H_I \quad \text{with} \quad c_I(f) := |I|^{-1}\int_I f H_I.$$
We briefly denote $d(I, \mathcal{P}) := |I|^{-\alpha\tau+1}|c_I(f)|^\tau$. Also, we set $d_0(I) := d(I, \mathcal{P})$ if $I$ is subdivided,
say, horizontally in $\mathcal{P}$, and $d_1(I) := d(I, \mathcal{P})$ if $I$ is subdivided vertically in $\mathcal{P}$. Then we have,
for the B-norm from (5.2),

$$\mathcal{N}(f, \mathcal{H}_{\mathcal{P}})^\tau = \sum_{I\in\mathcal{P}} d(I, \mathcal{P}) =: D(\mathcal{P}).$$

For a given dyadic box $J$, we denote by $\mathbf{P}_J$ the set of all dyadic partitions $\mathcal{P}_J$ of $J$ which are
subpartitions of partitions from $\mathbf{P}_n$. Similarly as above, we set

$$D(\mathcal{P}_J) := \sum_{I\in\mathcal{P}_J} d(I, \mathcal{P}_J).$$

We next describe a fast algorithm for finding a partition $\mathcal{P}^* \in \mathbb{P}_n$ which minimizes the B-norm $\mathcal{N}(f,\mathcal{P})$. The idea of this construction is to proceed from fine to coarse levels, minimizing $D(\mathcal{P}_J)$ for every dyadic box $J$ at every step. More precisely, we use the following recursive procedure. We first consider all boxes $J$ with $|J| = 2^{-2n+1}$. Each such box is the union of two adjacent pixels and, hence, it can be subdivided in exactly one way. Thus $\mathcal{P}_J^*$ is uniquely determined. Now, suppose that we have already found all partitions $\mathcal{P}_J^*$ of all dyadic boxes $J$ with $|J| \le 2^{-\mu}$ ($0 < \mu \le 2n$) which minimize $D(\mathcal{P}_J)$ over all partitions $\mathcal{P}_J \in \mathbb{P}_J$. Let $J$ be an arbitrary dyadic box such that $|J| = 2^{-\mu+1}$. There are two cases to be considered.

Case I: One of the sides of $J$ is of length $2^{-n}$. Then there is only one way to subdivide $J$ and, hence, $\mathcal{P}_J^*$ and $\min_{\mathcal{P}_J \in \mathbb{P}_J} D(\mathcal{P}_J) = D(\mathcal{P}_J^*)$ are uniquely determined.

Case II: Both sides of $J$ are of length $> 2^{-n}$. Then $J$ can be subdivided in two possible ways, horizontally or vertically, and, therefore, $J$ has two sets of children. Let us denote by $J_1^0$ and $J_2^0$ the children of $J$ obtained when dividing $J$ horizontally and by $J_1^1$ and $J_2^1$ the children of $J$ obtained when dividing $J$ vertically. The key observation is that

$$\min_{\mathcal{P}_J \in \mathbb{P}_J} D(\mathcal{P}_J) = \min\big\{ D(\mathcal{P}^*_{J_1^0}) + D(\mathcal{P}^*_{J_2^0}) + d_0(J),\ D(\mathcal{P}^*_{J_1^1}) + D(\mathcal{P}^*_{J_2^1}) + d_1(J) \big\}.$$

Therefore, if $\min_{\mathcal{P}_J \in \mathbb{P}_J} D(\mathcal{P}_J)$ is attained when $J$ is first subdivided horizontally, then $\mathcal{P}_J^* = \mathcal{P}^*_{J_1^0} \cup \mathcal{P}^*_{J_2^0} \cup \{J\}$ will be an optimal partition of $J$, and $\mathcal{P}_J^* = \mathcal{P}^*_{J_1^1} \cup \mathcal{P}^*_{J_2^1} \cup \{J\}$ will be optimal in the other case. We process like this every dyadic box of area $2^{-\mu+1}$ and this completes the recursive procedure. After finitely many steps we find a partition $\mathcal{P}^*$ of $\Omega$ which minimizes

$$D(\mathcal{P}) = \mathcal{N}(f,\mathcal{P})^\tau.$$
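The fine-to-coarse procedure above can be sketched in Python as a memoized recursion over dyadic boxes, which visits each box once and therefore computes the same minima as a level-by-level sweep. This is only an illustrative sketch under stated assumptions, not the author's implementation: the function name `best_haar_partition`, the box encoding `(a, b, i, j)` (row level, column level, row index, column index), and the convention that "axis 0" halves the rows while "axis 1" halves the columns are all hypothetical. The Haar function on a box is assumed to be $+1$ on one child and $-1$ on the other, so that $c_I(f) = |I|^{-1}\int f H_I$ is half the difference of the two child means, and the cost of a split is $|I|^{1-\alpha\tau}|c_I(f)|^\tau$.

```python
from functools import lru_cache


def best_haar_partition(f, alpha, tau):
    """Minimize D(P) = sum over boxes of |I|^(1 - alpha*tau) * |c_I(f)|^tau.

    f is a 2^n x 2^n nested list of pixel values on [0, 1)^2.
    Returns (minimal cost, {box: split axis}) for the non-pixel boxes
    of one optimal partition.
    """
    size = len(f)
    n = size.bit_length() - 1
    assert size == 1 << n and all(len(row) == size for row in f)

    # Summed-area table: table[r][c] = sum of f over rows < r and columns < c.
    table = [[0.0] * (size + 1) for _ in range(size + 1)]
    for r in range(size):
        for c in range(size):
            table[r + 1][c + 1] = (f[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])

    def mean(a, b, i, j):
        # Average of f over the box with 2^(n-a) rows and 2^(n-b) columns.
        h, w = size >> a, size >> b
        r0, c0 = i * h, j * w
        total = (table[r0 + h][c0 + w] - table[r0][c0 + w]
                 - table[r0 + h][c0] + table[r0][c0])
        return total / (h * w)

    def children(a, b, i, j, axis):
        if axis == 0:  # halve the rows
            return (a + 1, b, 2 * i, j), (a + 1, b, 2 * i + 1, j)
        return (a, b + 1, i, 2 * j), (a, b + 1, i, 2 * j + 1)  # halve the columns

    def split_cost(box, axis):
        a, b, _, _ = box
        first, second = children(*box, axis)
        coeff = 0.5 * (mean(*first) - mean(*second))  # assumed normalization of c_I(f)
        area = 2.0 ** (-(a + b))
        return area ** (1.0 - alpha * tau) * abs(coeff) ** tau

    @lru_cache(maxsize=None)
    def solve(box):
        a, b, _, _ = box
        best = None
        for axis in (0, 1):
            if (a if axis == 0 else b) >= n:
                continue  # this side already has pixel length
            first, second = children(*box, axis)
            cost = split_cost(box, axis) + solve(first)[0] + solve(second)[0]
            if best is None or cost < best[0]:
                best = (cost, axis)
        return best if best is not None else (0.0, None)  # a pixel costs nothing

    # Walk down from the root to read off one optimal partition.
    splits, stack = {}, [(0, 0, 0, 0)]
    while stack:
        box = stack.pop()
        axis = solve(box)[1]
        if axis is not None:
            splits[box] = axis
            stack.extend(children(*box, axis))
    return solve((0, 0, 0, 0))[0], splits
```

For an image that varies only from row to row, the sketch prefers to split the rows first, which keeps the number of nonzero coefficients small; with `alpha=0.5` and `tau=1.0` (so $p = 2$) the area weight is $|I|^{1/2}$.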

Every $f \in X_n$ belongs to any (discrete) space $B_\tau^{\alpha 1}(\mathcal{P})$ and we have, by Theorem 5.4,

$$\sigma_m(f)_p \le c\, m^{-\alpha} \inf_{\mathcal{P} \in \mathbb{P}_n} \|f\|_{B_\tau^{\alpha 1}(\mathcal{P})}, \qquad m = 1, 2, \dots.$$

Once the smoothness parameter $\alpha > 0$ is fixed, the above algorithm provides a basis which minimizes the $B^\alpha$-norm of $f$. It is a problem to find the optimal smoothness $\alpha$ of $f$.

Several remarks are in order: (i) For a given function $f \in X_n$, the number of all coefficients $c_I(f)$ (or Haar functions $H_I$) that participate in the representations of $f$ in all anisotropic Haar bases is $< 2N$, where $N := 2^{2n}$ is the number of the pixels. Moreover, these coefficients can be found by $O(N)$ operations.

(ii) For a given function $f \in X_n$ and fixed indices $\alpha$ and $\tau$, only $O(N)$ operations are needed to find a Haar basis $\mathcal{H}(f)$ which minimizes the B-norm $\mathcal{N}(f,\mathcal{P})$.

(iii) Another $O(N \ln N)$ operations (mainly for ordering the coefficients) are needed for finding a near best $n$-term approximation to $f$ from the best Haar basis $\mathcal{H}(f)$.
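The ordering step in remark (iii) can be illustrated by a short Python sketch. It assumes, as is standard for greedy approximation in $L_p$ but not spelled out in this passage, that the threshold rule keeps the $n$ terms $c_I(f)H_I$ of largest $L_p$-norm, and that the Haar functions are normalized so that $\|H_I\|_p = |I|^{1/p}$. The function name `largest_terms` and the dictionary layout are hypothetical.

```python
def largest_terms(coeffs, areas, p, n):
    """Return the keys of the n terms c_I * H_I with largest L_p norm.

    coeffs maps a box label to its Haar coefficient c_I(f),
    areas maps the same label to the box area |I|.
    The L_p norm of a term is |c_I| * |I|^(1/p) under the assumed
    normalization |H_I| = 1 on I.
    """
    def term_norm(label):
        return abs(coeffs[label]) * areas[label] ** (1.0 / p)

    # Sorting dominates the cost: O(N log N) for N coefficients.
    ranked = sorted(coeffs, key=term_norm, reverse=True)
    return ranked[:n]
```

A call such as `largest_terms(coeffs, areas, 2.0, n)` then lists the boxes whose terms would be retained in an $n$-term approximation in $L_2$.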

Evidently, the library of anisotropic Haar bases with the above threshold algorithm can be used for image compression. In particular, hybrid methods which utilize combinations of good biorthogonal wavelets and the library of Haar bases look promising.

The above idea for best basis selection can be utilized for best B-space selection, namely, for the selection of a partition $\mathcal{P}^*$ which minimizes the B-norm $\|f\|_{B_\tau^{\alpha k}(\mathcal{P})}$ of a given function $f$ when $k > 1$. Indeed, precisely as above we can find a partition $\mathcal{P}^* \in \mathbb{P}_n$ which minimizes $\|f\|_{B_\tau^{\alpha k}(\mathcal{P})}$ or an equivalent norm.

6  Concluding  remarks  and  open  problems 

Our results from §4 show that the set of $n$-term rational functions is a powerful tool for approximation. The $n$-term rational functions that we consider, however, depend on the coordinate system. It is natural to consider the more general $n$-term rational functions of the form $R = \sum_{j=1}^{n} r_j$, where each $r_j$ is of the form $r(Ax)$ with $r$ from (4.1) and $A$ any affine transform. The set of all such rational functions is independent of the coordinate system. Here we do not consider such more general approximation because our approximation method is limited by the conditions on the maximal inequality we use (see §4). We believe that $n$-term rational approximation should be considered as a special case of the more general
$n$-term approximation from the collection (dictionary) of all functions of the form $\varphi(u_1 x_1 + v_1, \dots, u_d x_d + v_d)$, or $\varphi(Ax)$, $A$ an affine transform, where $\varphi$ is a fixed smooth and well localized function such as $\varphi(x) := e^{-|x|^2}$. The ultimate goal of the theory of $n$-term rational approximation (of any type) is to characterize the corresponding approximation spaces. This article does not solve that problem but shows that the smoothness spaces which govern $n$-term rational approximation are fairly sophisticated ones.

We now turn to the fundamental question in nonlinear approximation (and not only there) of how to measure the smoothness of functions. In [17], we showed that all rates of nonlinear spline approximation are governed by the scale of Besov spaces $B_\tau^{\alpha,k}(L_\tau)$ ($1/\tau := \alpha + 1/p$). For more sophisticated multivariate nonlinear approximation, however, the Besov spaces are inappropriate. This is clearly the case when the approximation tool contains functions which are supported on long and narrow regions or have elongated level curves, like the piecewise polynomials and rational functions considered in this paper (see the end of §2). It is crystal clear to us that for highly nonlinear approximation such as multivariate piecewise polynomial approximation there does not exist a single super space scale (like the Besov spaces) suitable for measuring the smoothness. We believe that in many cases the smoothness of the functions should be measured by means of an appropriate collection of space scales which should vary with the approximation process. To illustrate this idea we return to the piecewise polynomial approximation considered in §3 and §5. For this type of approximation, a function $f$ should naturally be considered of smoothness $\alpha > 0$ if $\inf_{\mathcal{P}} \|f\|_{B_\tau^{\alpha k}(\mathcal{P})} < \infty$, which means that there exists a partition $\mathcal{P}^*$ such that $\|f\|_{B_\tau^{\alpha k}(\mathcal{P}^*)} < \infty$. Then the rate of the $n$-term piecewise polynomial (of degree $< k$) approximation to $f$ is $O(n^{-\alpha})$ (roughly).

Clearly, in nonlinear piecewise polynomial or rational approximation there is no saturation, which means that the corresponding approximation spaces $A_q^\gamma$ are nontrivial for all $\gamma > 0$. Therefore, it is highly desirable that the smoothness spaces we use characterize the approximation spaces $A_q^\gamma$ for all $0 < \gamma < \infty$. This was a guiding principle for us in designing the B-spaces in this article. Notice that all our approximation results from §3-§5 hold for each $\alpha > 0$. To make this point more transparent, we shall next briefly compare our results with existing ones, which involve Besov spaces. We first note that the situation in the univariate case is quite unique, since the scale of Besov spaces $B_\tau^{\alpha,k}(L_\tau)$ ($1/\tau = \alpha + 1/p$) governs all rates of nonlinear piecewise polynomial approximation (see [17]). Therefore, there is no reason for introducing B-spaces in dimension $d = 1$. They would be equivalent to the corresponding univariate Besov spaces and hence useless. Besov spaces are also used in dimensions $d > 1$ (see [5], [7], and [11]), but they are not the right smoothness spaces even for nonlinear piecewise polynomial approximation generated by regular partitions. It follows by the discussion
at the end of §2 (see (2.28)) and by Theorems 3.1-3.3 that the Besov spaces $B_\tau^{d\alpha,k}(L_\tau)$ can do the job when $0 < \alpha < 1/p$ and they fail when $\alpha > 1/p$. Of course, this range for $\alpha$ is wider when approximating from smoother piecewise polynomials (see [5], [7]). In a nutshell, the Besov spaces are the right smoothness spaces for characterization of nonlinear piecewise polynomial approximation in dimensions $d > 1$ only for regular partitions and for a limited range of approximation rates, and they are completely unsuitable in the anisotropic case.

Another important element of our concept is to have, together with the library of spaces, a companion library of bases which are (unconditional) bases for the spaces of interest. Such a library of bases should provide an effective tool for nonlinear $n$-term approximation. In this paper, we conveniently have the library of anisotropic Haar bases $\{\mathcal{H}_{\mathcal{P}}\}_{\mathcal{P}}$ which are unconditional bases for $\{L_p(\mathcal{P})\}_{\mathcal{P}}$ and characterize the $B_\tau^{\alpha 1}(\mathcal{P})$-spaces.

An open problem for bases is to construct libraries of anisotropic bases consisting of smooth functions.

Next, we pose some more delicate problems about the library of anisotropic Haar bases $\mathbb{H}$. The ultimate problem is to characterize the approximation spaces generated by $\{\sigma_n(f)_p\}$. The difficulty of this problem stems from the highly nonlinear nature of the approximation from the library $\mathbb{H}$. This problem is intimately connected to the problem of the existence of a near best (or best) basis: For a given function $f \in L_p$, does there exist a single Haar basis $\mathcal{H}(f) \in \mathbb{H}$ such that

$$\sigma_{cn}(f, \mathcal{H}(f))_p \le c \inf_{\mathcal{H} \in \mathbb{H}} \sigma_n(f, \mathcal{H})_p \quad \text{for all } n \ge 1 \ (c = \text{constant})\,?$$

The answer to this question is not known even for $p = 2$. If the answer to the latter question is “Yes”, then the approximation of any $f \in L_p$ from the library of anisotropic Haar bases $\mathbb{H}$ could be realized by approximation from a single basis $\mathcal{H}(f)$ and characterized by the interpolation spaces generated by $B_\tau^{\alpha}(\mathcal{P}^*)$, where $\mathcal{P}^*$ is determined from $\mathcal{H}_{\mathcal{P}^*} = \mathcal{H}(f)$.

7  Appendix 

A1. Proof of Theorems 2.4-2.6.
For the proof of Theorem 2.5, we need the following lemma:

Lemma 7.1. Suppose $\{\Phi_m\}$ satisfies conditions (i)-(ii) from Theorem 2.5 and $p > 1$. Let $F := \sum_{j \in J_n} |\Phi_j|$, where $\# J_n \le n$, and $\|\Phi_j\|_p \le A$ for $j \in J_n$. Then

$$\|F\|_p \le c A n^{1/p}$$

with $c = c(c_1)$.

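Proof. Using (i), we have

$$\|F\|_p \le \Big\| \sum_{j \in J_n} \|\Phi_j\|_\infty 1_{E_j}(\cdot) \Big\|_p \le cA \Big\| \sum_{j \in J_n} |E_j|^{-1/p} 1_{E_j}(\cdot) \Big\|_p.$$

We define

$$E := \bigcup_{j \in J_n} E_j \quad \text{and} \quad \lambda(x) := \min\{ |E_j| : j \in J_n \text{ and } E_j \ni x \}, \quad x \in E.$$

Evidently, property (ii) yields $\sum_{j \in J_n} |E_j|^{-1/p} 1_{E_j}(x) \le c_1 \lambda(x)^{-1/p}$, $x \in \mathbb{R}^d$. Therefore,

$$\|F\|_p \le cA \|\lambda(\cdot)^{-1/p}\|_{L_p(E)} = cA \Big( \int_E \lambda(x)^{-1}\, dx \Big)^{1/p} \le cA \Big( \sum_{j \in J_n} |E_j|^{-1} \int_{\mathbb{R}^d} 1_{E_j}(x)\, dx \Big)^{1/p} = cA (\# J_n)^{1/p} \le cA n^{1/p}.$$

□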
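As an informal sanity check of Lemma 7.1 (not part of the original argument), one can take the nested dyadic intervals $E_j := [0, 2^{-j})$ in dimension one and $\Phi_j := |E_j|^{-1/p} 1_{E_j}$, so that $\|\Phi_j\|_p = 1$. For this family the pointwise bound used in the proof holds with constant $(1 - 2^{-1/p})^{-1}$, since on $[2^{-(k+1)}, 2^{-k})$ the sum is $\sum_{j \le k} 2^{j/p}$ and $\lambda(x) = 2^{-k}$. The Python sketch below, with the hypothetical name `lemma_7_1_demo`, evaluates $\|F\|_p$ exactly for $F = \sum_{j<n} \Phi_j$ and returns it together with $n^{1/p}$ and the triangle-inequality bound $n$.

```python
def lemma_7_1_demo(n, p):
    """Return (||F||_p, n**(1/p), n) for F = sum_{j<n} |E_j|^(-1/p) 1_{E_j},
    where E_j = [0, 2^-j) are nested dyadic intervals in [0, 1)."""
    total = 0.0
    partial = 0.0  # value of F on the current dyadic shell
    for k in range(n):
        partial += 2.0 ** (k / p)
        # F equals `partial` on [2^-(k+1), 2^-k); for k = n-1 the whole
        # innermost interval [0, 2^-(n-1)) carries this value.
        measure = 2.0 ** (-(k + 1)) if k < n - 1 else 2.0 ** (-k)
        total += measure * partial ** p
    return total ** (1.0 / p), n ** (1.0 / p), float(n)
```

In this example the computed norm stays within a fixed multiple of $n^{1/p}$, whereas the triangle inequality alone would only give the bound $n$.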


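Proof of Theorem 2.5. The theorem is trivial if $0 < \tau \le 1$. Let $\tau > 1$. Then $p > 1$. Let $\{\Phi_j^*\}_{j=1}^\infty$ be a rearrangement of the sequence $\{\Phi_j\}$ so that $\|\Phi_1^*\|_p \ge \|\Phi_2^*\|_p \ge \cdots$. Obviously,

$$\|\Phi_j^*\|_p \le j^{-1/\tau} \mathcal{N}, \quad \text{where } \mathcal{N} := \Big( \sum_j \|\Phi_j\|_p^\tau \Big)^{1/\tau}. \tag{7.1}$$

We define $J_m := \{ j : 2^{-m} \mathcal{N} \le \|\Phi_j\|_p < 2^{-m+1} \mathcal{N} \}$. Then $\bigcup_{\mu \le m} J_\mu = \{ j : \|\Phi_j\|_p \ge 2^{-m} \mathcal{N} \}$ and hence, using (7.1),

$$\# J_m \le \# \Big( \bigcup_{\mu \le m} J_\mu \Big) \le 2^{m\tau}. \tag{7.2}$$

We denote $F_m := \sum_{j \in J_m} |\Phi_j|$. Using Lemma 7.1 and (7.2), we obtain

$$\Big\| \sum_j |\Phi_j(\cdot)| \Big\|_p \le \sum_{m=0}^\infty \|F_m\|_p \le c \sum_{m=0}^\infty (\# J_m)^{1/p}\, 2^{-m} \mathcal{N} \le c \mathcal{N} \sum_{m=0}^\infty 2^{-m\tau(1/\tau - 1/p)} \le c \mathcal{N}. \qquad □$$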
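For readers who want the arithmetic behind the last two steps of this chain, the exponent bookkeeping can be spelled out as follows. This is only a restatement of the estimate above, writing $\mathcal{N}$ for the constant on the right-hand side of (7.1) and applying Lemma 7.1 with $A = 2^{-m+1}\mathcal{N}$ and $n = \# J_m$.

```latex
\[
\|F_m\|_p \le c\,\bigl(2^{-m+1}\mathcal{N}\bigr)\,(\# J_m)^{1/p}
          \le 2c\,\mathcal{N}\,2^{-m}\,2^{m\tau/p}
          = 2c\,\mathcal{N}\,2^{-m\tau(1/\tau-1/p)},
\]
\[
\sum_{m=0}^{\infty} 2^{-m\tau(1/\tau-1/p)}
   = \frac{1}{1-2^{-\tau(1/\tau-1/p)}} < \infty
   \qquad \text{since } \tau < p .
\]
```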


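Proof of Theorem 2.4. Case I: $1 < p < \infty$. We introduce the following abbreviated notation: $T_m := T_{m,\eta}(f)$, $t_m := t_{m,\eta}(f)$, and $t_I := 1_I \cdot t_m$ if $I \in \mathcal{P}_m$, $m \in \mathbb{Z}$ (see (2.9)). By (2.17), we have

$$\mathcal{N}_{t,\eta}(f,\mathcal{P}) \approx \Big( \sum_{I \in \mathcal{P}} \|t_I\|_p^\tau \Big)^{1/\tau} =: \mathcal{N}(f). \tag{7.3}$$

Clearly, the sequence $\{t_I\}_{I \in \mathcal{P}}$ satisfies the conditions of Theorem 2.5 and hence

$$\Big\| \sum_{j \in \mathbb{Z}} |t_j(\cdot)| \Big\|_p \le c\, \mathcal{N}(f). \tag{7.4}$$

We define $g(x) := T_0(x) + \sum_{j=1}^\infty t_j(x)$, $x \in \mathbb{R}^d$. By (7.4), $\sum_{j \in \mathbb{Z}} |t_j(x)| < \infty$ for almost all $x \in \mathbb{R}^d$ and hence $g$ is well defined. Clearly, $g = T_m + \sum_{j=m+1}^\infty t_j$ a.e. on $\mathbb{R}^d$ for each $m \in \mathbb{Z}$, with the series converging absolutely a.e. From this and (7.4), we infer $\|g - T_m\|_p \le \big\| \sum_{j=m+1}^\infty |t_j(\cdot)| \big\|_p \to 0$ as $m \to \infty$. On the other hand, since $f \in L_\eta$, $\|f - T_m\|_{L_\eta(I)} \to 0$ as $m \to \infty$ for each $I \in \mathcal{P}$. Therefore, $f = g$ a.e. and hence

$$f - T_m = \sum_{j=m+1}^\infty t_j \quad \text{a.e. on } \mathbb{R}^d, \quad m \in \mathbb{Z}, \tag{7.5}$$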

where  the  series  converges  absolutely  a.e.,  and  in  addition  to  this  /  G  Rp{V,  A:). 

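We shall next show that there exists a polynomial $P \in \Pi_k$ such that

$$T_m - P = \sum_{j=-\infty}^{m} t_j \quad \text{in } L_\infty(\mathbb{R}^d), \quad m \in \mathbb{Z}. \tag{7.6}$$

Indeed, using Lemma 2.1 and (7.4), we obtain

$$\|t_j\|_{L_\infty(I)} \le c |I|^{-1/p} \|t_j\|_{L_p(I)} \le c 2^{j/p} \|t_j\|_{L_p(I)} \le c 2^{j/p} \mathcal{N}(f), \quad I \in \mathcal{P}_j,$$

and hence $\|t_j\|_{L_\infty(\mathbb{R}^d)} \le c 2^{j/p} \mathcal{N}(f)$. Therefore,

$$\sum_{j=-\infty}^{m} \|t_j\|_{L_\infty(\mathbb{R}^d)} \le c 2^{m/p} \mathcal{N}(f), \quad m \in \mathbb{Z}. \tag{7.7}$$

Fix $I \in \mathcal{P}$. If $-m$ is sufficiently large and $\mu \le -1$, then $T_m - T_{m+\mu}$ is an algebraic polynomial of degree $< k$ on $I$ and

$$\|T_m - T_{m+\mu}\|_{L_\infty(I)} = \Big\| \sum_{j=m+\mu+1}^{m} t_j \Big\|_{L_\infty(I)} \le \sum_{j=m+\mu+1}^{m} \|t_j\|_{L_\infty(I)} \to 0 \quad \text{as } m \to -\infty,$$

where we used (7.7). Therefore, there exists $Q_I \in \Pi_k$ such that

$$\lim_{m \to -\infty} \|T_m - Q_I\|_{L_\infty(I)} = 0.$$

From this and (7.7), it readily follows that there exists a unique polynomial $P \in \Pi_k$ such that $\lim_{m \to -\infty} \|T_m - P\|_{L_\infty(\mathbb{R}^d)} = 0$. This and (7.7) imply (7.6). Going further, (7.4)-(7.6) yield

$$f - P = \sum_{j \in \mathbb{Z}} t_j \quad \text{a.e. on } \mathbb{R}^d \tag{7.8}$$

with the series converging absolutely a.e., and

$$\|f - P\|_p \le \Big\| \sum_{j \in \mathbb{Z}} |t_j(\cdot)| \Big\|_p \le c\, \mathcal{N}_{t,\eta}(f,\mathcal{P}) < \infty. \tag{7.9}$$

Now, since $f \in L_p(\mathbb{R}^d)$ and $f - P \in L_p(\mathbb{R}^d)$, we have $P = 0$, and (7.8)-(7.9) imply Theorem 2.4 in Case I.

Case II: $0 < p \le 1$. Since $p \le 1$ and $\tau/p < 1$, we immediately obtain

$$\Big\| \sum_{j \in \mathbb{Z}} |t_j(\cdot)| \Big\|_p^p = \Big\| \sum_{I \in \mathcal{P}} |t_I(\cdot)| \Big\|_p^p \le \sum_{I \in \mathcal{P}} \|t_I\|_p^p \le \Big( \sum_{I \in \mathcal{P}} \|t_I\|_p^\tau \Big)^{p/\tau} \le c \|f\|_{B_\tau^{\alpha k}(\mathcal{P})}^p.$$

This replaces (7.4) and everything else is the same as in Case I. We shall skip the details.

□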


Proof  of  Theorem  2.6.  The  equivalence  of  A/L^(-,P)  and  can  be  proved  exactly 

as  Lemma  2.3  was  proved  and  we  skip  its  proof.  If  0  <  rj  <  r,  then  the  equivalence  of 
II  ’  | \i3?k(v)  and  Aft,ri{v'P)  follows  by  (2.14). 

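The estimate $\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} \le \mathcal{N}_{\omega,\eta}(f,\mathcal{P})$, for $\tau < \eta < p$, is immediate by applying Hölder's inequality. It remains to prove that, for $f \in B_\tau^{\alpha k}(\mathcal{P})$,

$$\mathcal{N}_{\omega,\eta}(f,\mathcal{P}) \le c \|f\|_{B_\tau^{\alpha k}(\mathcal{P})}, \quad \tau < \eta < p. \tag{7.10}$$

Since $f \in B_\tau^{\alpha k}(\mathcal{P})$, by Theorem 2.4 (with $\eta = \tau$), $f$ can be represented in the form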

f  =  ^tj  a.e.  on  Rd  (7.11) 

j'G^  /G"Pj 

with  the  series  converging  absolutely  a.e.,  where  P  G  n*,,  tj  :=  tj,T(/),  and  t/  :=  1/  •  tj,  if 
/  G  Pj,  and 

V, ,.(/,*>)'  =  ElG“l</t 

lev 

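Evidently, $\omega_k(t_j, J)_\eta = 0$ for $J \in \mathcal{P}_m$ and $j \le m$. We use Lemma 2.1 to obtain, for $J \in \mathcal{P}_m$ and $j > m$,

$$\omega_k(t_j, J)_\eta^\eta \le c \|t_j\|_{L_\eta(J)}^\eta = c \sum_{I \in \mathcal{P}_j,\, I \subset J} \|t_I\|_\eta^\eta \le c \sum_{I \in \mathcal{P}_j,\, I \subset J} \|t_I\|_\tau^\eta |I|^{1 - \eta/\tau}.$$

Set $\lambda := \min\{\eta, 1\}$. Using (7.11), we have, for $J \in \mathcal{P}_m$,

$$\omega_k(f, J)_\eta \le \Big( \sum_{j=m+1}^\infty \omega_k(t_j, J)_\eta^\lambda \Big)^{1/\lambda} \le c \Big( \sum_{j=m+1}^\infty \Big[ \sum_{I \in \mathcal{P}_j,\, I \subset J} \|t_I\|_\tau^\eta |I|^{1 - \eta/\tau} \Big]^{\lambda/\eta} \Big)^{1/\lambda}.$$

Therefore,

$$\mathcal{N}_{\omega,\eta}(f,\mathcal{P})^\tau = \sum_{m \in \mathbb{Z}} \sum_{J \in \mathcal{P}_m} \big( |J|^{1/p - 1/\eta} \omega_k(f, J)_\eta \big)^\tau$$

$$\le c \sum_{m \in \mathbb{Z}} \sum_{J \in \mathcal{P}_m} |J|^{\tau(1/p - 1/\eta)} \Big( \sum_{j=m+1}^\infty \Big[ \sum_{I \in \mathcal{P}_j,\, I \subset J} \|t_I\|_\tau^\eta |I|^{1 - \eta/\tau} \Big]^{\lambda/\eta} \Big)^{\tau/\lambda}$$

$$= c \sum_{m \in \mathbb{Z}} \sum_{J \in \mathcal{P}_m} \Big( \sum_{j=m+1}^\infty \Big[ \sum_{I \in \mathcal{P}_j,\, I \subset J} \big( |I|^{-\alpha} \|t_I\|_\tau \big)^\eta \big( |I|/|J| \big)^{\eta\gamma} \Big]^{\lambda/\eta} \Big)^{\tau/\lambda}$$

$$= c \sum_{m \in \mathbb{Z}} \sum_{J \in \mathcal{P}_m} \Big( \sum_{j=m+1}^\infty 2^{-\gamma(j-m)\lambda} \Big[ \sum_{I \in \mathcal{P}_j,\, I \subset J} A_I^\eta \Big]^{\lambda/\eta} \Big)^{\tau/\lambda} =: c \sum_{m \in \mathbb{Z}} \sum_{J \in \mathcal{P}_m} S_{m,J}^{\tau/\lambda},$$

where $A_I := |I|^{-\alpha} \|t_I\|_\tau$ and $\gamma := \alpha + \frac{1}{\eta} - \frac{1}{\tau} = \frac{1}{\eta} - \frac{1}{p} > 0$. We now want to switch the order of summation, so this is a Hardy inequality type situation. We first estimate $S_{m,J}$ by using Hölder's inequality. Choose $\gamma_1, \gamma_2 > 0$ such that $\gamma_1 + \gamma_2 = \gamma$ and set $s := \eta/\lambda$, $1/s' := 1 - 1/s$. We obtain

$$S_{m,J} = \sum_{j=m+1}^\infty 2^{-\gamma_1(j-m)\lambda}\, 2^{-\gamma_2(j-m)\lambda} \Big( \sum_{I \in \mathcal{P}_j,\, I \subset J} A_I^\eta \Big)^{\lambda/\eta}$$

$$\le \Big[ \sum_{j=m+1}^\infty \big( 2^{-\gamma_1(j-m)\lambda} \big)^{s'} \Big]^{1/s'} \Big[ \sum_{j=m+1}^\infty \Big( 2^{-\gamma_2(j-m)\lambda} \Big( \sum_{I \in \mathcal{P}_j,\, I \subset J} A_I^\eta \Big)^{\lambda/\eta} \Big)^{s} \Big]^{1/s}$$

$$\le c \Big( \sum_{j=m+1}^\infty 2^{-\gamma_2(j-m)\eta} \sum_{I \in \mathcal{P}_j,\, I \subset J} A_I^\eta \Big)^{\lambda/\eta} \le c \Big( \sum_{j=m+1}^\infty 2^{-\gamma_2(j-m)\tau} \sum_{I \in \mathcal{P}_j,\, I \subset J} A_I^\tau \Big)^{\lambda/\tau},$$

where we used that $\tau < \eta$. Combining this result with the previous estimates, we obtain

$$\mathcal{N}_{\omega,\eta}(f,\mathcal{P})^\tau \le c \sum_{m \in \mathbb{Z}} \sum_{J \in \mathcal{P}_m} \sum_{j=m+1}^\infty 2^{-\gamma_2(j-m)\tau} \sum_{I \in \mathcal{P}_j,\, I \subset J} A_I^\tau$$

$$\le c \sum_{j \in \mathbb{Z}} \sum_{I \in \mathcal{P}_j} A_I^\tau \sum_{m=-\infty}^{j-1} 2^{-\gamma_2(j-m)\tau} \le c \sum_{j \in \mathbb{Z}} \sum_{I \in \mathcal{P}_j} A_I^\tau = c\, \mathcal{N}_{t,\tau}(f,\mathcal{P})^\tau,$$

where we switched the order of summation. Thus (7.10) is proved.

The following simple example shows that the equivalence of $\|\cdot\|_{B_\tau^{\alpha k}(\mathcal{P})}$ and $\mathcal{N}_{\omega,\eta}(\cdot,\mathcal{P})$ is not valid if $\eta > p$. Let $f := 1_I$ for some $I \in \mathcal{P}$. It is readily seen that $\|f\|_{B_\tau^{\alpha k}(\mathcal{P})} \approx |I|^{1/p} \approx \|f\|_p$ and at the same time $\mathcal{N}_{\omega,\eta}(f,\mathcal{P}) = \infty$ if $\eta > p$. □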


A2.  Proof  of  Theorem  3.4. 


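We first prove that, for $f \in B_\tau^\alpha$, $B_\tau^\alpha := B_\tau^{\alpha k}(\mathcal{P})$,

$$\|f\|_{A_\tau^\alpha} \le c \|f\|_{B_\tau^\alpha}. \tag{7.12}$$

By Theorem 2.6 and (2.17), $\|f\|_{B_\tau^\alpha} \approx \big( \sum_{I \in \mathcal{P}} \|t_I\|_p^\tau \big)^{1/\tau}$ with $t_I := t_{I,\eta}(f) := 1_I \cdot t_{m,\eta}(f)$ if $I \in \mathcal{P}_m$ ($0 < \eta < p$). Therefore, if $\|t_{I_1}\|_p \ge \|t_{I_2}\|_p \ge \cdots$ is a nonincreasing rearrangement of the sequence $\{\|t_I\|_p\}$, then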

E2' 


On  the  other  hand,  Theorem  2.4  implies 


<  oo) 


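$$\sigma_m(f,\mathcal{P})_p \le c \Big\| \sum_{j > m} |t_{I_j}(\cdot)| \Big\|_p.$$

Evidently, the sequence $\{t_{I_j}\}$ satisfies the conditions of Theorem 2.5 and, therefore, we can apply Lemma 7.1 to obtain

$$\sigma_{2^\nu}(f,\mathcal{P})_p \le c \sum_{j=\nu}^\infty \Big\| \sum_{\ell=2^j+1}^{2^{j+1}} |t_{I_\ell}(\cdot)| \Big\|_p \le c \sum_{j=\nu}^\infty 2^{j/p} \|t_{I_{2^j}}\|_p, \quad \text{if } 1 < p < \infty. \tag{7.13}$$

Clearly,

$$\sigma_{2^\nu}(f,\mathcal{P})_p^p \le \sum_{\ell=2^\nu+1}^\infty \|t_{I_\ell}\|_p^p \le \sum_{j=\nu}^\infty 2^j \|t_{I_{2^j}}\|_p^p, \quad \text{if } 0 < p \le 1. \tag{7.14}$$

We insert (7.13) or (7.14), respectively, in the definition of $\|f\|_{A_\tau^\alpha}$ (see (3.6)) and apply inequality (2.12) to obtain (7.12).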

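We next prove that if $f \in A_\tau^\alpha$, then $f \in B_\tau^\alpha$ and

$$\|f\|_{B_\tau^\alpha} \le c \|f\|_{A_\tau^\alpha}. \tag{7.15}$$

Case I: $\tau \le 1$. We may assume that $\varphi_m \in \Sigma_m^k(\mathcal{P})$ are such that $\|f - \varphi_m\|_p = \sigma_m(f,\mathcal{P})_p$. Since $f \in A_\tau^\alpha(L_p,\mathcal{P})$, then $\sigma_m(f,\mathcal{P})_p \to 0$ and hence

$$f = \varphi_1 + \sum_{\nu=1}^\infty (\varphi_{2^\nu} - \varphi_{2^{\nu-1}}) \quad \text{in } L_p. \tag{7.16}$$

On the other hand, since $\|\varphi_{2^\nu} - \varphi_{2^{\nu-1}}\|_p \le c\, \sigma_{2^{\nu-1}}(f)_p$,

$$\Big\| |\varphi_1| + \sum_{\nu=1}^\infty |\varphi_{2^\nu} - \varphi_{2^{\nu-1}}| \Big\|_p^\mu \le \|f\|_p^\mu + \|f - \varphi_1\|_p^\mu + \sum_{\nu=1}^\infty \|\varphi_{2^\nu} - \varphi_{2^{\nu-1}}\|_p^\mu$$

$$\le \|f\|_p^\mu + c \sum_{\nu=0}^\infty \sigma_{2^\nu}(f,\mathcal{P})_p^\mu \le \|f\|_p^\mu + c \Big( \sum_{\nu=0}^\infty \sigma_{2^\nu}(f,\mathcal{P})_p^\tau \Big)^{\mu/\tau} \le c \|f\|_{A_\tau^\alpha}^\mu < \infty$$

with $\mu := \min\{p, 1\}$, where we used that $\tau \le \mu$. Therefore, the series in (7.16) converges absolutely a.e. on $\mathbb{R}^d$ as well. From this, we readily obtain ($\tau \le 1$)

$$\|f\|_{B_\tau^\alpha}^\tau \le \|\varphi_1\|_{B_\tau^\alpha}^\tau + \sum_{\nu=1}^\infty \|\varphi_{2^\nu} - \varphi_{2^{\nu-1}}\|_{B_\tau^\alpha}^\tau.$$

Applying the Bernstein inequality from Theorem 3.2 to each term above, we get

$$\|f\|_{B_\tau^\alpha}^\tau \le c \|\varphi_1\|_p^\tau + c \sum_{\nu=1}^\infty 2^{\nu\alpha\tau} \|\varphi_{2^\nu} - \varphi_{2^{\nu-1}}\|_p^\tau \le c \|f\|_p^\tau + c \sum_{\nu=0}^\infty \big( 2^{\nu\alpha} \sigma_{2^\nu}(f,\mathcal{P})_p \big)^\tau \le c \|f\|_{A_\tau^\alpha}^\tau.$$

This completes the proof of (7.15) in Case I.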

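Case II: $\tau > 1$. Then $p > 1$. This case is more complicated and will require more careful analysis. We may assume that $\varphi_m \in \Sigma_m^k(\mathcal{P})$ are such that $\|f - \varphi_m\|_p = \sigma_m(f,\mathcal{P})_p$. Let

$$\varphi_m =: \sum_{I \in \Lambda_m} 1_I \cdot P_{m,I}, \quad \text{where } \Lambda_m \subset \mathcal{P}, \ \# \Lambda_m \le m, \text{ and } P_{m,I} \in \Pi_k.$$

Set $\Lambda_\nu^* := \bigcup_{j=0}^{\nu} \Lambda_{2^j}$. We have

$$\Lambda_\nu^* \subset \Lambda_{\nu+1}^* \quad \text{and} \quad \# \Lambda_\nu^* \le \sum_{j=0}^{\nu} 2^j = 2^{\nu+1} - 1 \quad \text{for } \nu = 1, 2, \dots.$$

In this part, our construction is quite similar to the one from the proof of Theorem 3.2. Let $I_{\nu,0} \in \mathcal{P}$ be the smallest box containing all boxes from $\Lambda_\nu^*$ and let $\mathcal{T}_\nu^*$ be the minimal binary subtree of $\mathcal{P}$ containing $\Lambda_\nu^* \cup \{I_{\nu,0}\}$. The set $\Lambda_\nu^*$ induces a natural subdivision of $\mathbb{R}^d$ into a union of disjoint maximal rings. By definition, $R$ is a ring if $R = I \setminus J$, where $I \in \mathcal{P}$ or $I = \mathbb{R}^d$ and $J \in \mathcal{P}$ or $J = \emptyset$. We say that $R = I \setminus J$ is a maximal ring generated by $\Lambda_\nu^*$ if (a) $I \in \mathcal{T}_\nu^*$ or $I = \mathbb{R}^d$ and $J \in \mathcal{T}_\nu^*$ or $J = \emptyset$, (b) $R$ does not contain a box smaller than $I$ from $\Lambda_\nu^*$, and (c) $R$ is maximal with these two properties. We let $\rho_\nu^*$ denote the set of all maximal rings generated by $\Lambda_\nu^*$. We have the following possibilities for a ring $R \in \rho_\nu^*$ with $R =: I \setminus J$: (i) $I$ is a final box in $\mathcal{T}_\nu^*$ and $J = \emptyset$; (ii) $J \in \Lambda_\nu^*$ or $J$ is a branching box in $\mathcal{T}_\nu^*$; (iii) $I = \mathbb{R}^d$ and $J = I_{\nu,0}$. Therefore, $\# \rho_\nu^* \le 3 \# \Lambda_\nu^* + 1 \le 6 \cdot 2^\nu$. Note that $\rho_\nu^*$ is a collection of disjoint rings such that

$$\mathbb{R}^d = \bigcup_{R \in \rho_\nu^*} R.$$

Also, since $\Lambda_\nu^* \subset \Lambda_{\nu+1}^*$, for each $R \in \rho_{\nu+1}^*$, we have either $R \in \rho_\nu^*$ or $R \subset K$ for some $K \in \rho_\nu^*$. Thus $\{\rho_\nu^*\}$ is a sequence of nested rings.

For each ring $R \in \rho_\nu^*$, we denote by $I_R$ (the mother box of $R$) the smallest box from $\mathcal{P}$ containing $R$ and by $I_R'$ the largest box from $\mathcal{P}$ contained in $R$. Note that $I_R'$ is uniquely determined and is one of the two children of $I_R$ in $\mathcal{P}$. Also, we define $P_R \in \Pi_k$ by the identity

$$\|f - P_R\|_{L_p(I_R')} = \inf_{P \in \Pi_k} \|f - P\|_{L_p(I_R')} =: E_k(f, I_R')_p.$$

It is easily seen (using Lemma 2.1) that

$$\|f - P_R\|_{L_p(R)} \le c E_k(f, R)_p. \tag{7.17}$$

Now, we set $\varphi_\nu^* := \sum_{R \in \rho_\nu^*} 1_R \cdot P_R$. It follows, from $\Lambda_{2^\nu} \subset \Lambda_\nu^*$ and (7.17),

$$\|f - \varphi_\nu^*\|_p \le c \|f - \varphi_{2^\nu}\|_p = c\, \sigma_{2^\nu}(f,\mathcal{P})_p. \tag{7.18}$$

By the definition of $\varphi_\nu^*$, if $R \in \rho_\nu^*$ and $K \in \rho_{\nu-1}^*$ with $I_R = I_K$, then $R \subset K$ and $\varphi_\nu^* = \varphi_{\nu-1}^*$ on $R$. We let $\rho_\nu^\circ$ ($\nu \ge 1$) denote the set of all rings from $\rho_\nu^* \setminus \rho_{\nu-1}^*$ which do not share mother boxes with rings from $\rho_{\nu-1}^*$ and set $\rho_0^\circ := \rho_0^*$. Note that $\rho_\nu^\circ$ is a collection of disjoint rings. From the above arguments, every two sets from the sequence $\{\rho_\nu^\circ\}_{\nu=0}^\infty$ are disjoint and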

pT  —  pT-i  '=  ^  Ir  ‘  Pr  ='■  ^  v>l.  (7.19) 

Rtpi  R£pt 


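Note that, using (7.18),

$$\Big\| \sum_{R \in \rho_\nu^\circ} \Phi_R \Big\|_p = \|\varphi_\nu^* - \varphi_{\nu-1}^*\|_p \le c\, \sigma_{2^{\nu-1}}(f,\mathcal{P})_p, \quad \nu \ge 1. \tag{7.20}$$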

Let  77  G  U^L0p)j  and  77  =:  7  \  .7  with  7  G  Vf  and  J  G  Ve+n  for  some  f  G  Z  and  p  >  1.  For 
7  <  m  <  l  +  p.  there  is  a  unique  P  G  Vm.  such  that  J  C  P  C  7.  We  define  <E>#iTO  :=  lj|  •  T  /,. 
and set $\Phi_{R,m} := 0$ if $m < \ell$ or $m > \ell + \mu$.

Since $\|f - \varphi_\nu^*\|_p \le c \|f - \varphi_{2^\nu}\|_p \to 0$ and $f \in A_\tau^\alpha(L_p,\mathcal{P})$, similarly as in Case I (see (7.16)) we have

$$f = \varphi_0^* + \sum_{\nu=1}^\infty (\varphi_\nu^* - \varphi_{\nu-1}^*) \quad \text{in } L_p \tag{7.21}$$

with the series converging absolutely almost everywhere as well.

We denote by $\mathcal{R}_m$ the set of all rings $R \in \rho^\circ := \bigcup_{\nu=0}^\infty \rho_\nu^\circ$ such that $I_R \in \mathcal{P}_m$ and let $\mathcal{K}_m$ be the set of all rings $R \in \rho^\circ$ with $R =: I \setminus J$ such that $|J| < 2^{-m} < |I|$. Clearly, $\mathcal{R}_m \cup \mathcal{K}_m$ is a set of disjoint rings. From this, (7.19), and (7.21), it readily follows that ($\tau > 1$)


E  <  4  E  ii  E  «*uT+4i  E  4 

p—Tn-\- 1  ReV-/x  Re)C 


r 

R,m  T 


lev \ 


=  4  E  <  E  n^ipir+c  E  ii4*.™it- 

H=m+ 1  ReTZ/j.  ReKm 


Therefore, 


B2 


=  E2“”*TE 

mGZ  ieVm 

oo 

<  cEj2™  E  <E  ii^bT+cE2”’”'  E  ii'itoii;  =  +  s2. 

raGZ  p,—Tn-\- 1  ReV-fx  mez  ReKlrn 

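We apply inequality (2.12) to the first sum above to obtain

$$\Sigma_1 \le c \sum_{m \in \mathbb{Z}} 2^{\alpha m \tau} \sum_{R \in \mathcal{R}_m} \|\Phi_R\|_\tau^\tau \le c \sum_{R \in \rho^\circ} \|\Phi_R\|_p^\tau,$$

where we used that $\|\Phi_R\|_\tau \le |I_R|^{1/\tau - 1/p} \|\Phi_R\|_p = 2^{-\alpha m} \|\Phi_R\|_p$, $R \in \mathcal{R}_m$, by Hölder's inequality.

We shall estimate $\Sigma_2$ by using the inequalities: (a) $\|\Phi_{R,m}\|_\tau \le 2^{-\alpha m} \|\Phi_{R,m}\|_p$, which follows by Hölder's inequality as above, and (b) $\sum_{m \in \mathbb{Z}} \|\Phi_{R,m}\|_p^\tau \le c \|\Phi_R\|_p^\tau$, $R \in \rho^\circ$, which can be proved exactly as (3.5) was proved. Applying these inequalities, we find

$$\Sigma_2 \le c \sum_{m \in \mathbb{Z}} \sum_{R \in \mathcal{K}_m} \|\Phi_{R,m}\|_p^\tau \le c \sum_{R \in \rho^\circ} \sum_{m \in \mathbb{Z}} \|\Phi_{R,m}\|_p^\tau \le c \sum_{R \in \rho^\circ} \|\Phi_R\|_p^\tau,$$

where we switched the order of summation.

Combining the above estimates for $\Sigma_1$ and $\Sigma_2$, we obtain

$$\|f\|_{B_\tau^\alpha}^\tau \le c \sum_{R \in \rho^\circ} \|\Phi_R\|_p^\tau = c \sum_{\nu=0}^\infty \sum_{R \in \rho_\nu^\circ} \|\Phi_R\|_p^\tau$$

$$\le c \sum_{\nu=0}^\infty \Big( \sum_{R \in \rho_\nu^\circ} \|\Phi_R\|_p^p \Big)^{\tau/p} (\# \rho_\nu^\circ)^{1 - \tau/p} \le c \|\varphi_0^*\|_p^\tau + c \sum_{\nu=1}^\infty 2^{\nu\alpha\tau} \|\varphi_\nu^* - \varphi_{\nu-1}^*\|_p^\tau$$

$$\le c \|f\|_p^\tau + c \sum_{\nu=0}^\infty \big( 2^{\nu\alpha} \sigma_{2^\nu}(f,\mathcal{P})_p \big)^\tau \le c \|f\|_{A_\tau^\alpha}^\tau,$$

where we used (7.20) and Hölder's inequality. This completes the proof of (7.15) in Case II.

□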